Speech recognition
Revision as of 11:23, 13 October 2025
This page contains information on Dutch speech recognition systems.
Online services
BAS Web Services
The BAS Web Services are a rich set of tools for speech sciences and technology. Tools include:
- Automated speech recognition, including several models for Dutch
- Anonymizer
- Audio segmentation tool on the basis of transcripts
- Speaker diarisation
- Voice activity detection
- Web interface (requires CLARIN login)
LaMachine webservices
There are several speech recognition web services at Radboud University: https://webservices.cls.ru.nl/
Speech Recognition for Belgian Dutch: NeLF
API and browser access to a state-of-the-art speech recognition system for Belgian Dutch, including dialect speech recognition, developed by KU Leuven and UGent.
Access requires a login, which you can request; accounts are approved manually, so expect to wait for approval.
HENSOLDT ANALYTICS Speech-to-text for Dutch (demo)
The European Language Grid hosts this speech recognition service, with a demo at https://live.european-language-grid.eu/catalogue/tool-service/23090/try%20out/
Microsoft Transcriber
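- Available in Word 365
- Website in Dutch: https://support.microsoft.com/nl-nl/office/uw-opnamen-transcriberen-7fc2efec-245e-45f0-b053-2a97531ecf57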
To install
noScribe
- AI-based software that transcribes interviews for qualitative social research or journalistic use
- free and open source (GPL-3.0)
- runs completely locally on your computer
- can distinguish different speakers and understands around 60 languages
- includes a nice editor to review, verify and correct the resulting transcript
- builds on Whisper from OpenAI, faster-whisper by Guillaume Klein and pyannote from Hervé Bredin
- GitHub page: https://github.com/kaixxx/noScribe
Whisper model from OpenAI
Whisper provides ASR for multiple languages, including Dutch. The full models can be downloaded.
- Webpage
- GitHub page
- YouTube video explaining how to install Whisper on a Windows machine
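Once Whisper is installed, transcribing a Dutch recording from Python could look roughly like the sketch below. It assumes the `openai-whisper` package and ffmpeg are installed; the file name `interview.wav`, the choice of the `small` model and the helper functions are illustrative assumptions, not part of Whisper itself.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments) -> str:
    """Return one line per segment: [start - end] text."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_dutch(audio_path: str, model_name: str = "small") -> str:
    """Transcribe a Dutch audio file and return timestamped lines."""
    # Imported here so the helpers above work without the package installed.
    import whisper  # pip install openai-whisper (ffmpeg must be on the PATH)

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(audio_path, language="nl")  # "nl" = Dutch
    return format_segments(result["segments"])


if __name__ == "__main__":
    # Hypothetical file name; replace with your own recording.
    print(transcribe_dutch("interview.wav"))
```

Larger models are generally more accurate but slower and need more memory, so the model size is worth adjusting to your hardware.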
Punctuation Insertion
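As ASR output often consists of unpunctuated streams of words, you may want to insert punctuation automatically.
- HuggingFace model: oliverguhr/fullstop-dutch-sonar-punctuation-prediction
- Python script (Segment_FullStop.py in the VincentCCL/Segment_FullStop GitHub repository) that accepts a txt file as input and returns punctuated txt as output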
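As an illustration, punctuation could be restored from Python with the HuggingFace model `oliverguhr/fullstop-dutch-sonar-punctuation-prediction`. This is a minimal sketch, assuming the `deepmultilingualpunctuation` package is installed and accepts this model name; the capitalisation helper is an added post-processing step of our own (the assumption being that the model only inserts punctuation and leaves casing unchanged).

```python
import re

# Model identifier on the HuggingFace hub.
MODEL_NAME = "oliverguhr/fullstop-dutch-sonar-punctuation-prediction"


def capitalize_sentences(text: str) -> str:
    """Uppercase the first letter of the text and of each sentence after . ? or !"""
    return re.sub(
        r"(^|[.?!]\s+)([^\W\d_])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


def punctuate(text: str) -> str:
    """Insert punctuation into raw ASR output, then capitalise sentence starts."""
    # Imported here so capitalize_sentences works without the package installed.
    from deepmultilingualpunctuation import PunctuationModel  # pip install deepmultilingualpunctuation

    model = PunctuationModel(model=MODEL_NAME)
    return capitalize_sentences(model.restore_punctuation(text))


if __name__ == "__main__":
    raw = "hervatting van de zitting ik verklaar de zitting van het europees parlement te zijn hervat"
    print(punctuate(raw))
```

The model is downloaded on first use, so the first call needs an internet connection.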