Discussion on Low-Bitrate HD Solutions for Safe City

Safe City construction has entered a new round, especially in first-tier cities. Its coverage keeps expanding: beyond traditional video surveillance and alarm management, it now takes in urban management functions such as traffic enforcement cameras (electronic police), emergency response systems, and parking management, raising the level of digital city and smart city construction.


Video surveillance is a bandwidth-intensive application, so the construction boom brings with it a large number of IT systems, including metropolitan area networks, video storage systems, and video stream forwarding servers. These systems in turn create a large and complicated maintenance burden. From the first large-scale high-definition surveillance deployment at the 2010 World Expo to the large-scale high-definition surveillance system built in Chongqing in 2011, the pressure on video surveillance systems has kept growing. It shows up mainly as:

· The conflict between real-time HD image clarity and network bandwidth;

· The conflict between HD video playback resolution and video storage capacity;

· The conflict between the number of HD video playback users and the throughput of the video storage system.

These conflicts raise the cost and complexity of system construction. Whether a system is built directly by the government or under the BT (build-transfer) model by telecom operators or video surveillance vendors, investing large sums of public money in urban public security management amid the current international financial crisis and domestic inflationary pressure is a heavy economic burden for governments at all levels that are still developing.

The author has many years of experience in communications and image processing. This article offers some personal technical views in the hope of advancing the adoption and improvement of high-definition surveillance systems.

Option 1: Using the latest video coding technology

In today's Safe City systems, high-definition surveillance involves a chain of data processes: encoding, transmission, decoding, and storage. To save investment, many deployments use distributed storage, in which front-end HD video data is stored in the sub-control centers on devices such as NVRs, H-DVRs, NAS, and IP-SAN. The advantage of distributed storage is that each sub-center holds its own data, which makes management convenient and reliability good in a large-scale system. Centralized storage instead gathers the front-end HD video data in a storage center through NAS, IP-SAN, and similar devices; security is good, but the investment cost grows with the scale.

HD encoding currently relies mainly on H.264, and a 1080p stream is generally 8 Mb/s. One day of recording therefore needs 8*3600*24/8/1024 = 84.4 GB of disk space, and one month needs 84.4*30/1024 = 2.5 TB. With 40 1080p monitoring points deployed, one month of video reaches about 100 TB. Even a 48-disk array fully populated with 2 TB disks does not have enough usable space once RAID 5 is configured. High-definition surveillance is therefore bound to increase the cost of storage equipment.
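
The arithmetic above can be checked with a short script. This is only a minimal sketch: the function names are illustrative, the 1024-based units follow the article's own calculation, and the single-group RAID 5 layout (one disk's worth of parity across all 48 disks) is an assumption, since the article does not describe how the array is configured.

```python
def daily_storage_gb(bitrate_mbps: float) -> float:
    """Storage for 24 hours of continuous recording, in GB (1 GB = 1024 MB)."""
    seconds_per_day = 3600 * 24
    megabytes = bitrate_mbps * seconds_per_day / 8  # megabits -> megabytes
    return megabytes / 1024


def monthly_storage_tb(bitrate_mbps: float, days: int = 30) -> float:
    """Storage for `days` of continuous recording, in TB (1 TB = 1024 GB)."""
    return daily_storage_gb(bitrate_mbps) * days / 1024


def raid5_usable_tb(disks: int, disk_tb: float) -> float:
    """Usable space of one RAID 5 group (assumed layout): one disk is parity."""
    return (disks - 1) * disk_tb


if __name__ == "__main__":
    per_camera = monthly_storage_tb(8)   # one 1080p camera at 8 Mb/s
    total = 40 * per_camera              # 40 monitoring points
    usable = raid5_usable_tb(48, 2)      # 48 disks of 2 TB each
    print(f"per camera per day:   {daily_storage_gb(8):.1f} GB")
    print(f"per camera per month: {per_camera:.2f} TB")
    print(f"40 cameras per month: {total:.1f} TB")
    print(f"RAID 5 usable space:  {usable:.0f} TB")
```

Under these assumptions the 40-camera total comes to roughly 99 TB, which is more than the 94 TB left after RAID 5 parity.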

Whether distributed or centralized storage is chosen, building a huge video storage system is unavoidable.

IPTV video on demand achieves high quality at low bitrates

However, current telecom IPTV video-on-demand services already achieve 2 Mbps@720p and 4 Mbps@1080p, and some proprietary encoding algorithms reach ultra-low bitrates of 512 Kbps@720p and 1 Mbps@1080p, which is even lower than the H.265 encoding algorithm released in 2013 (2 Mbps@1080p). Moreover, video-on-demand content is full-motion movie video, whose picture complexity is much higher than that of surveillance footage (surveillance pictures are generally about 30% dynamic). So can the surveillance industry learn anything from IPTV?

Admittedly, the video-on-demand files used by ultra-low-bitrate telecom IPTV services are generally low-bitrate, high-quality, small files produced by multi-pass compression, and that work is done offline rather than in real time.

Real-time transcoding generally means converting a TV station's high-definition programs into a proprietary format in real time, so that a high-quality, low-bitrate stream can be delivered to users' homes over existing networks such as ADSL. In the table above, the real-time transcoded bitrate for full HD 1080p is only 2.2 Mbps, which is very low compared with the 4 Mbps stream common in the surveillance industry. Given that surveillance pictures are only about 30% dynamic, the author believes a real-time transcoded stream of 1 Mbps should be achievable. Even for a full-motion scene such as a traffic jam, 2.2 Mbps is still a good result.

Of course, these carrier-grade video compression algorithms may not run on current TI compression processing chips at all. Low-bitrate encoding would have to run either on a specially designed ASIC compression chip or directly on a PC server.

Adding a video transcoding layer based on telecom design ideas

Based on these technical developments, the author proposes applying the design approach of telecommunication systems and giving the current Safe City system a layered design.

In a traditional Safe City system, video is encoded and transmitted to the sub-control center for live viewing and recording. At the system level it can be divided into a video access layer, a video recording layer, a real-time video display layer, and a video forwarding layer.

The author's new design adds an independent video transcoding layer to the traditional four-layer plan. The video transcoding layer decodes the video from the video access layer and then re-encodes it to obtain a smaller stream while maintaining picture quality, thereby saving storage space.

In the traditional design, the video coding layer generally outputs two H.264 streams. The first is a high-bitrate stream (6-8 Mbps), generally used for real-time display; the second is a medium-bitrate stream (3-5 Mbps) used for recording, mainly to reduce storage space.

In the new design, the video coding layer outputs an MJPEG stream directly. The MJPEG stream is relatively large (1080p can reach 50 Mbps), but MJPEG has essentially no image loss, allows real-time display in the sub-control center without delay, and takes very little CPU to decode. After receiving the MJPEG stream, the video transcoding layer compresses the images a second time using the most advanced video coding technology, obtaining a very low bitrate while maintaining picture quality, and then sends that stream to the video recording layer for storage.

In the new design, the video forwarding layer forwards the low-bitrate video from the video recording layer to a video playback workstation or other remote workstation, where it is decoded with a standard H.264 decoder.

In this design, the new video encoding layer can also support HD-CCTV cameras by taking in SDI video signals directly. By the design standards of the telecommunications industry, real-time encoding requires twice the PC processing power of delayed (offline) encoding. With real-time encoding, a PC server with a 2.8 GHz i5 can therefore transcode 2 channels of 1080p video. By Moore's Law, CPU performance doubles every 18 months at the same price, and Intel will soon release its latest CPU series based on 3D transistor technology, so a PC server with an i7-series CPU could transcode about 8 channels of HD video. The overall cost should still be much lower than the storage cost saved.
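
To give a feel for the scale of the transcoding layer, the sketch below turns the per-server channel counts quoted above (2 channels for the i5 server, about 8 for the i7 server) into a server count. The function name and the 40-camera example are illustrative; the camera count is borrowed from the earlier storage estimate rather than from any deployment described by the author.

```python
import math


def servers_needed(channels: int, channels_per_server: int) -> int:
    """Transcoding servers required, rounded up to whole machines."""
    if channels_per_server <= 0:
        raise ValueError("channels_per_server must be positive")
    return math.ceil(channels / channels_per_server)


if __name__ == "__main__":
    cameras = 40  # illustrative scale, same as the storage example
    print("i5-class servers (2 channels each):", servers_needed(cameras, 2))
    print("i7-class servers (8 channels each):", servers_needed(cameras, 8))
```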

However, using the MJPEG stream directly greatly increases the network bandwidth requirement. On a government-built public security network, even the cheapest optical transceiver provides 100 Mbps Ethernet, so the main network pressure falls on the core switch. To accommodate the new transcoding layer, it is recommended to use several Gigabit aggregation-layer switches, split the network into multiple segments, perform transcoding within each segment, and then connect to the recording and storage layer.
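
The bandwidth pressure can be estimated roughly as follows. This is a sketch under stated assumptions: the 50 Mbps figure is the article's 1080p MJPEG bitrate, while the 70% link utilization ceiling, the function names, and the 40-camera example are hypothetical values chosen only for illustration.

```python
import math


def streams_per_link(link_mbps: int, stream_mbps: int, utilization_pct: int = 70) -> int:
    """Whole streams that fit on one link below the utilization ceiling.

    utilization_pct is an assumed planning margin, not a figure from the article.
    """
    return (link_mbps * utilization_pct) // (stream_mbps * 100)


def segments_needed(cameras: int, per_segment: int) -> int:
    """Network segments required so that no uplink is oversubscribed."""
    if per_segment <= 0:
        raise ValueError("a segment must carry at least one stream")
    return math.ceil(cameras / per_segment)


if __name__ == "__main__":
    mjpeg_mbps = 50
    print("streams per 100 Mbps access link:", streams_per_link(100, mjpeg_mbps))
    per_gigabit = streams_per_link(1000, mjpeg_mbps)
    print("streams per Gigabit uplink:", per_gigabit)
    print("Gigabit segments for 40 cameras:", segments_needed(40, per_gigabit))
```

With these assumptions a 100 Mbps access link carries a single MJPEG camera, which is why the aggregation layer rather than the access link becomes the place where capacity has to be planned.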

If the telecom BT construction model is adopted, it is recommended to install an embedded low-bitrate HD encoder module between the HD network camera and the access network. If an HD-CCTV camera is used, its SDI output can be connected directly to the embedded low-bitrate HD encoder for encoding.

HD network cameras need a new generation of compression algorithms, but H.265 mainly serves the core interests of the film and television industry. Its focus is on compatibility among algorithms and on playback security, protecting the interests of the various patent holders, so its development and application impose certain limitations on the security industry. Rather than waiting for the new standard to mature, it is better to innovate and adopt advanced proprietary algorithms that give security users a low-cost solution.

Option 2: Using intelligent video analysis technology

As noted above, surveillance video is characterized by about 30% dynamic pictures (people and vehicles moving), while the other 70% are essentially secondary pictures (background trees, flowers, small animals). If only the pictures containing people or vehicles are recorded, a great deal of storage space is saved.

Identify important pictures

Within that 30% of dynamic pictures, perhaps only 20% of the motion is of interest to us (inside the area of interest); the other 80% of person and vehicle movement does not affect our safety (outside the area of interest). Important pictures therefore make up only 30% x 20% = 6% of the total, so if intelligent video analysis is used to identify them, 94% of the high-bitrate storage space can be saved. Because the recognition accuracy of intelligent video analysis algorithms is limited, the remaining 94% of secondary pictures can still be recorded at a low frame rate with a high bitrate (or, depending on the scene, at a low bitrate with a high frame rate for applications with high real-time requirements).
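
The effect of this two-tier recording on the average recording bitrate can be sketched as below. The 30% and 20% shares and the 8 Mbps full-rate figure come from the article; the 1 Mbps bitrate assumed for the reduced (secondary) stream and the function names are hypothetical and serve only to make the calculation concrete.

```python
def important_share(dynamic_share: float, interest_share: float) -> float:
    """Fraction of all pictures that are both dynamic and inside the area of interest."""
    return dynamic_share * interest_share


def average_bitrate_mbps(full_mbps: float, reduced_mbps: float, important: float) -> float:
    """Time-weighted average bitrate when only important pictures use the full stream."""
    return important * full_mbps + (1.0 - important) * reduced_mbps


if __name__ == "__main__":
    share = important_share(0.30, 0.20)
    # reduced_mbps=1.0 is an assumed value for the low-frame-rate stream.
    avg = average_bitrate_mbps(full_mbps=8.0, reduced_mbps=1.0, important=share)
    print(f"important share of pictures: {share:.0%}")
    print(f"average recording bitrate:   {avg:.2f} Mbps")
```

The result depends heavily on the bitrate chosen for the secondary stream, so the second number should be read as an illustration of the method rather than as a prediction.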

Of course, most current cameras have a motion detection function. Could this free function do the job? In the author's opinion, not only is the camera's built-in motion detection unable to realize the idea above, but even most intelligent video analysis products on the market cannot achieve this design goal. Why?

To answer that, we have to review the technical core of intelligent video analysis. It generally consists of four parts: picture segmentation (target detection), foreground/background separation (target tracking), target classification, and target recognition. Current VMD (video motion detection) only performs picture segmentation and cannot suppress the large number of false positives. Advanced VMD adds target tracking, which greatly reduces false alarms caused by outdoor environments; by manually calibrating the depth of field, it uses target pixel size to filter out implausible targets, such as small insects crawling quickly across the lens. More advanced VMD applies a variety of rules, such as the target's minimum moving speed, minimum displacement in pixels, time of appearance, contrast, and minimum pixel size, and performs a simple classification, for example treating a target with too many pixels or too high a speed as a vehicle.
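
The rule-based filtering described above can be illustrated with a small sketch. It is not any vendor's algorithm: the field names, the thresholds, and the reading of "time of appearance" as how long a target has been visible are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """One tracked moving region. All fields are hypothetical."""
    speed_px_s: float       # moving speed, pixels per second
    displacement_px: float  # total distance moved, pixels
    visible_s: float        # how long the target has been visible, seconds
    contrast: float         # contrast against the background, 0.0 to 1.0
    area_px: int            # target size, pixels


@dataclass
class Rules:
    """Illustrative thresholds; real values depend on per-scene calibration."""
    min_speed_px_s: float = 5.0
    min_displacement_px: float = 20.0
    min_visible_s: float = 1.0
    min_contrast: float = 0.1
    min_area_px: int = 200
    vehicle_area_px: int = 5000        # larger than this: treated as a vehicle
    vehicle_speed_px_s: float = 300.0  # faster than this: treated as a vehicle


def classify(track: Track, rules: Rules) -> str:
    """Return "reject", "vehicle", or "person" using fixed thresholds only."""
    below_minimum = (
        track.speed_px_s < rules.min_speed_px_s
        or track.displacement_px < rules.min_displacement_px
        or track.visible_s < rules.min_visible_s
        or track.contrast < rules.min_contrast
        or track.area_px < rules.min_area_px
    )
    if below_minimum:
        return "reject"
    if track.area_px > rules.vehicle_area_px or track.speed_px_s > rules.vehicle_speed_px_s:
        return "vehicle"
    return "person"


if __name__ == "__main__":
    rules = Rules()
    insect = Track(speed_px_s=500, displacement_px=300, visible_s=0.2, contrast=0.5, area_px=40)
    walker = Track(speed_px_s=40, displacement_px=150, visible_s=4.0, contrast=0.4, area_px=900)
    car = Track(speed_px_s=120, displacement_px=600, visible_s=3.0, contrast=0.6, area_px=9000)
    for name, track in [("insect", insect), ("walker", walker), ("car", car)]:
        print(name, "->", classify(track, rules))
```

Because every decision here rests on fixed thresholds, a person whose visible area drops below `min_area_px` (for example when partly hidden) would be rejected, and the thresholds would need retuning whenever the scene changes.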

Some advanced VMD algorithms can achieve good results through complicated configuration, but they cannot adapt automatically to changes in weather and environment. They have to be reconfigured every time the season changes, which makes system maintenance a challenge.

In fact, picture segmentation, foreground/background separation, and target classification do not form a one-way data flow: target classification in turn affects the quality of picture segmentation. As long as any pixels move, picture segmentation isolates them, and foreground/background separation then has to spend computation tracking the motion of these pixel clusters to determine whether each is noise, a shaking branch, or a moving human target.

In addition, traditional picture segmentation is simple but computation-heavy work for the DSP. As shown in Figure 1, such a large amount of computation requires a powerful DSP, which is why most intelligent video analysis algorithms on the market only process CIF pictures of 352x288: once a 4CIF picture is analyzed, the DSP has no computing power left for video encoding.

For HD, the video data of a 1080p full HD picture can be as high as 500 Mbps, which is an insurmountable load for manufacturers using traditional intelligent video analysis algorithms.
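
The jump in raw data volume from CIF to 1080p can be made concrete with a short calculation. The article does not state the pixel format or frame rate behind its 500 Mbps figure; the sketch below assumes 8 bits per pixel (luma only) at 30 frames per second, which lands close to that figure. With 12 bits per pixel (YUV 4:2:0) at the same frame rate the result would be about 746 Mbps.

```python
def raw_bitrate_mbps(width: int, height: int, bits_per_pixel: int, fps: int) -> float:
    """Uncompressed video data rate in Mbps (1 Mbps = 1,000,000 bits per second)."""
    return width * height * bits_per_pixel * fps / 1_000_000


def pixel_ratio(width_a: int, height_a: int, width_b: int, height_b: int) -> float:
    """How many times more pixels per frame format A has than format B."""
    return (width_a * height_a) / (width_b * height_b)


if __name__ == "__main__":
    # 8 bits per pixel and 30 fps are assumptions, not values from the article.
    print(f"CIF raw rate:   {raw_bitrate_mbps(352, 288, 8, 30):.1f} Mbps")
    print(f"1080p raw rate: {raw_bitrate_mbps(1920, 1080, 8, 30):.1f} Mbps")
    print(f"1080p / CIF pixels per frame: {pixel_ratio(1920, 1080, 352, 288):.1f}x")
```

A 1080p frame holds roughly twenty times as many pixels as a CIF frame, so a segmentation algorithm whose cost grows with pixel count faces a correspondingly larger workload.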

Current intelligent video analysis algorithms use a recognition model that imitates the human brain. The human eye does not judge a target's category by its size but by its features. For example, if a person's lower body is hidden by a car and only the upper body is visible, the human eye can still tell that this is a person standing behind the car, whereas ordinary advanced VMD, seeing so few pixels, concludes that it is not a person.

Figure: Target classification by image segmentation

The author understands that the latest intelligent video technology from VideoIQ in the United States, based on a neural artificial intelligence algorithm, can greatly suppress false positives caused by natural phenomena such as rain, snow, wind, small animals, birds, changes in light and shadow, and branches, thanks to more than 200 million built-in target patterns and a powerful neural learning algorithm. Advanced motion detection, by contrast, identifies targets by pixel size alone, ignoring appearance texture, color, geometry, gait, and other person and vehicle patterns, which leads to a high false-positive rate. In summer, for example, insects fly in front of the camera and dew and raindrops slide across the lens surface. Classified by pixel size, these produce target classification errors, causing many false positives and wasted recording space.

Applying preliminary target classification during the picture segmentation stage, so that the two complement each other, greatly reduces the DSP computation needed for segmentation. As a result, a single DaVinci DSP can perform both 1080p intelligent video analysis and 1080p video encoding.

With this advanced target classification algorithm, an HD camera can return the full HD stream for recording only when important pictures are present. When there is no important picture, it returns a low-frame-rate stream that still preserves 1080p clarity. The best way to use the algorithm is to embed it directly in the front-end smart camera, which simplifies system management.
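
The switching behavior described above might look like the following in outline. This is a hypothetical sketch, not the camera firmware's logic: the mode names and in particular the hold time, which keeps the full stream running for a few seconds after the last important target was seen, are assumptions added for illustration.

```python
from enum import Enum
from typing import Optional


class StreamMode(Enum):
    FULL = "full frame rate, high bitrate"
    LOW_FRAME_RATE = "low frame rate, 1080p resolution"


def select_mode(now_s: float, last_important_s: Optional[float], hold_s: float = 10.0) -> StreamMode:
    """Choose the stream to return for recording.

    last_important_s is the time the classifier last reported an important
    target (None if never). hold_s is an assumed hold time, so recording does
    not drop to the low frame rate the instant a target leaves the scene.
    """
    if last_important_s is not None and now_s - last_important_s <= hold_s:
        return StreamMode.FULL
    return StreamMode.LOW_FRAME_RATE


if __name__ == "__main__":
    print(select_mode(now_s=100.0, last_important_s=None).value)
    print(select_mode(now_s=100.0, last_important_s=95.0).value)
    print(select_mode(now_s=100.0, last_important_s=80.0).value)
```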

If the front-end camera is an ordinary HD network camera, intelligent video analysis can instead run on a PC at the back end to achieve the same effect.

Conclusion

Safe City construction is new. Even Europe and the United States offer few cases for reference. Domestic surveillance manufacturers should work together to create China's own technologies and standards, developing new technical capabilities in the course of building China's Safe Cities and then taking them to the world.
