Natural spoken dialogue system for intelligent hardware | Shanghai Jiao Tong University

Natural spoken dialogue system for intelligent hardware

A natural spoken interaction solution for intelligent hardware in multi-scene, non-cooperative settings.

Type

Intelligent dialogue platform

Tags

Other resource gains

Artificial intelligence

Voiceprint recognition

Dialog system

Semantic theory

Voice wake-up

Speech recognition

Solution maturity

Mass promotion / Mass production

Cooperation methods

Joint venture cooperation

Applicable industry

Information transmission, software and information technology services

Applications

Smart device

Key innovations

The innovation of this project is to develop a non-cooperative perception-cognitive full-link natural spoken interaction system, which achieves major breakthroughs in anti-noise recognition, offline wake-up, semantic understanding and customized architecture, and greatly improves the performance of intelligent hardware interaction.

Potential economic benefits

Through large-scale application and high market share, the project has generated additional output value of 4.7 billion yuan over the past three years and registered more than 50 million intelligent hardware terminals, delivering significant economic benefits.

Potential climate benefits

Through low-resource offline processing and low computing complexity, this technology reduces energy consumption in smart hardware and in the cloud, thereby indirectly reducing power consumption and carbon emissions.

Solution supplier

Shanghai Jiao Tong University

Shanghai Jiao Tong University is a top university in China, committed to cultivating outstanding talents, leading scientific and technological innovation, and serving national strategic development.

Shanghai, China

Solution details

This project belongs to the field of artificial intelligence. With the rapid spread of intelligent hardware in recent years, natural spoken interaction, with voice as the main channel, is becoming the most convenient form of human-computer communication. Although cooperative near-field speech recognition is already used in industry, it cannot meet the non-cooperative understanding and interaction needs of intelligent hardware, and this has become a bottleneck for industrial applications. This project carries out research and development of non-cooperative perception-cognitive full-link natural spoken interaction system technology. Through joint industry-university-research efforts, it has successfully developed a scene-based natural spoken dialogue system solution for intelligent hardware. The main innovations are as follows:

1. Anti-noise speech recognition technology for complex interactive scenarios with intelligent hardware. To address the accuracy, speed and big-data training problems of non-cooperative speech recognition in complex scenes, a new deep learning anti-noise model, a phoneme-synchronous fast decoding algorithm and a deep learning parallel training acceleration algorithm are proposed. The project obtained the lowest recognition error rate on the international standard anti-noise recognition test set, improved speech recognition search speed by more than 20 times, and can complete training on tens of thousands of hours of speech data in a single day. On the hardware side, audio transmission and microphone array signal processing technology, together with hardware modules that adapt to multiple types of intelligent hardware, have been developed to achieve high-precision far-field sound source tracking and positioning.

2. Low-resource offline voice wake-up and ultra-short-time voiceprint recognition technology. To address offline (non-networked), low-resource voice wake-up and ultra-short-time voiceprint recognition in complex intelligent hardware scenes, a new deep-feature voiceprint recognition algorithm and an unlimited voice wake-up algorithm, both with low computing complexity, are proposed. They greatly improve the accuracy and computing speed of wake-up and voiceprint verification on small hardware devices.

3. Scalable, robust semantic understanding and fault-tolerant dialogue technology. To address the unstable understanding caused by changing semantic domains, recognition errors and ambiguity in task-based dialogue, a semantic understanding and dialogue state tracking framework driven jointly by knowledge and data is proposed. It achieves high-precision semantic understanding and dialogue state tracking with limited data, and meets the need to expand semantic domains rapidly. The project also invented fault-tolerant and error-correcting technology for spoken dialogue systems and was the first in the industry to implement it in a vehicle-mounted spoken dialogue system.

4. Loosely coupled task-based dialogue system architecture and dialogue system customization technology. To support large-scale personalized customization of the full-link natural spoken dialogue system, a loosely coupled task-based dialogue system architecture was invented, and the country's first software and hardware integrated "cloud + terminal" full-link task-based spoken dialogue system customization platform was built. To improve the voice interaction performance of customized systems, a series of model adaptation technologies for speech recognition and semantic understanding have been developed, and customization systems and new interactive input methods for language models and dialogue skills have been invented.

The project has obtained 44 nationally authorized invention patents and 15 software copyrights, and has contributed to 2 national standards.
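The fault-tolerant dialogue idea described in item 3 can be illustrated with a minimal Python sketch. It is not the project's actual method: the noisy-OR belief update, the two thresholds, and all names (`DialogueState`, `next_action`, `ACCEPT_THRESHOLD`, `CONFIRM_THRESHOLD`) are assumptions introduced here purely for illustration. The sketch keeps a score for every slot value heard so far and asks for confirmation, rather than committing, when the recognizer's confidence is only moderate.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds, chosen only for illustration.
ACCEPT_THRESHOLD = 0.8
CONFIRM_THRESHOLD = 0.4


@dataclass
class DialogueState:
    """Belief over slot values: slot -> {value: accumulated score}."""
    beliefs: dict = field(default_factory=dict)

    def update(self, hypotheses):
        """Fold a list of (slot, value, confidence) hypotheses into the belief."""
        for slot, value, confidence in hypotheses:
            scores = self.beliefs.setdefault(slot, {})
            # Noisy-OR accumulation (an assumed rule): repeated evidence
            # for the same value raises its score toward 1.0.
            previous = scores.get(value, 0.0)
            scores[value] = 1.0 - (1.0 - previous) * (1.0 - confidence)

    def best(self, slot):
        """Return the top (value, score) for a slot, or (None, 0.0)."""
        scores = self.beliefs.get(slot, {})
        if not scores:
            return None, 0.0
        value = max(scores, key=scores.get)
        return value, scores[value]


def next_action(state, slot):
    """Choose a system action for one slot from the current belief."""
    value, score = state.best(slot)
    if score >= ACCEPT_THRESHOLD:
        return ("accept", value)    # confident enough to act
    if score >= CONFIRM_THRESHOLD:
        return ("confirm", value)   # ask "Did you mean ...?"
    return ("request", None)        # ask the user to say it again
```

In this sketch, a single noisy hypothesis with confidence 0.55 leads to a confirmation question; if the user repeats the same value and it is recognized with confidence 0.7, the accumulated score passes the acceptance threshold and the system proceeds. A slot that has never been mentioned yields a request.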
The results have been used on a large scale in smart vehicles, smart homes, and smart robots. In smart vehicles, they have been applied to in-vehicle terminal products, including factory-installed products for SAIC, Geely, Dongfeng, Wuling, and FAW, as well as aftermarket products such as 360 driving recorders. In smart homes, they have been applied to Changhong, Haier, Siemens, Hisense and other major appliance and kitchen and bathroom brands, as well as smart speakers from Alibaba, Tencent, and Lenovo, with a market share of over 80% of the smart speaker market. In intelligent robots, they have been applied to robots from brands such as Tencent, Jingdong, and Xiaobawang. In total, more than 50 million smart hardware terminals have been registered, with an added output value of 4.7 billion yuan over the past three years.

Last updated

11:53:22, Nov 04, 2025

Information contributed by

Green Technology Bank
