Mp4grep - a CLI for transcribing and searching audio/video files

mp4grep

mp4grep is a tool that transcribes audio and video files and searches the transcripts for a regex pattern. mp4grep isn't just for mp4 files! It also supports mp3, ogg, webm, mov, and wav.

Screenshots

[Images: search, transcribe, help]

Compatible transcription models

mp4grep depends on Vosk to transcribe audio. By default, mp4grep ships with a lightweight 40 MB English model. To transcribe dialects of English or other languages accurately, you will need a different model (specify it with --model). You can download other models from Vosk's official list.
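As a rough sketch of searching with a non-default model: the helper below checks that the model directory exists before calling mp4grep. The function name `search_with_model` is hypothetical, and combining `--model` with a search pattern in this argument order is an assumption based on the invocations shown elsewhere on this page.

```shell
# Hypothetical helper (not part of mp4grep): search MEDIA for PATTERN
# using the unpacked Vosk model in MODEL_DIR.
search_with_model() {
    model_dir=$1
    pattern=$2
    media=$3

    # Fail early with a readable message instead of a transcription error.
    if [ ! -d "$model_dir" ]; then
        echo "model directory not found: $model_dir" >&2
        return 1
    fi

    # Assumed argument order: --model first, then the pattern and the file.
    mp4grep --model "$model_dir" "$pattern" "$media"
}
```

For example, with a model unpacked to a directory of your choosing (the path here is illustrative): `search_with_model ~/models/vosk-model-en-us-0.22 "the birch canoe" harvard_sentences.mp4`.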

Installation

  1. Download mp4grep and unzip it in the location where you want to install it:

unzip mp4grep-v0.1.0.zip

  2. Run the install script, which adds mp4grep to your PATH and sets its environment variables:

cd mp4grep-v0.1.0

source install.sh

  3. Use mp4grep to search!

mp4grep "the birch canoe" harvard_sentences.mp4
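If you would rather not let a script edit your shell startup file, the same setup can be done by hand. The sketch below prints the three export lines for a given install directory. The function name `mp4grep_env_lines` is hypothetical, and the layout (`.mp4grep_cache`, `model`, and `bin` under the install directory) is an assumption taken from the install script output reported in the comments on this page.

```shell
# Hypothetical helper: print the export lines for an mp4grep
# installation unpacked at INSTALL_DIR (layout assumed, see above).
mp4grep_env_lines() {
    install_dir=$1
    printf 'export MP4GREP_CACHE="%s/.mp4grep_cache"\n' "$install_dir"
    printf 'export MP4GREP_MODEL="%s/model"\n' "$install_dir"
    # $PATH is kept literal here so it is expanded when the rc file is read.
    printf 'export PATH="$PATH:%s/bin"\n' "$install_dir"
}
```

Review the output first, then append it to the startup file of your shell, for example `mp4grep_env_lines "$PWD" >> ~/.bashrc` for bash or `>> ~/.zshrc` for zsh.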

Pull requests

Pull requests are welcome. Please open a pull request if you have a bug to fix or a cool idea.

Platforms

mp4grep currently supports Linux.

Comments
  • Vosk MalformedJsonException

    Hi there, I'm struggling to use your tool. After installing the latest release, I get the following error when running mp4grep --model $MP4GREP_MODEL --transcribe mytestfile.mp3:

    Exception in thread "Thread-2" com.google.gson.JsonSyntaxException: com.google.gson.stream.MalformedJsonException: Expected ':' at line 3 column 25 path $.result[0].861850
    	at com.google.gson.Gson.fromJson(Gson.java:947)
    	at com.google.gson.Gson.fromJson(Gson.java:897)
    	at com.google.gson.Gson.fromJson(Gson.java:846)
    	at com.google.gson.Gson.fromJson(Gson.java:817)
    	at Transcribe.VoskAdapter.getJsonObject(VoskAdapter.java:163)
    	at Transcribe.VoskAdapter.writeToCacheFiles(VoskAdapter.java:139)
    	at Transcribe.VoskAdapter.writeMainRecognizerResults(VoskAdapter.java:120)
    	at Transcribe.VoskAdapter.transcribeAudio(VoskAdapter.java:40)
    	at Transcribe.VoskProxy.getSearchableTranscript(VoskProxy.java:14)
    	at Transcribe.Cache.TranscriptCache.getSearchable(TranscriptCache.java:43)
    	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
    	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:960)
    	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:934)
    	at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
    	at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
    	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
    	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:686)
    	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:927)
    	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
    	at Transcribe.TranscriptAdapter.lambda$getTranscription$1(TranscriptAdapter.java:140)
    	at java.base/java.lang.Thread.run(Thread.java:833)
    Caused by: com.google.gson.stream.MalformedJsonException: Expected ':' at line 3 column 25 path $.result[0].861850
    	at com.google.gson.stream.JsonReader.syntaxError(JsonReader.java:1562)
    	at com.google.gson.stream.JsonReader.doPeek(JsonReader.java:529)
    	at com.google.gson.stream.JsonReader.peek(JsonReader.java:424)
    	at com.google.gson.internal.bind.TypeAdapters$29.read(TypeAdapters.java:700)
    	at com.google.gson.internal.bind.TypeAdapters$29.read(TypeAdapters.java:723)
    	at com.google.gson.internal.bind.TypeAdapters$29.read(TypeAdapters.java:715)
    	at com.google.gson.internal.bind.TypeAdapters$29.read(TypeAdapters.java:723)
    	at com.google.gson.internal.bind.TypeAdapters$29.read(TypeAdapters.java:698)
    	at com.google.gson.internal.bind.TypeAdapters$35$1.read(TypeAdapters.java:894)
    	at com.google.gson.Gson.fromJson(Gson.java:932)
    	... 24 more
    

    I tried different audio files and different models, always with the same error. If I call the tool again, it just reports 100% transcription, but no text is printed.

    opened by Matthias84 11
  • Issue when installing

    I'm using Debian 11. I downloaded 0.1.1, extracted it, cd'd into the folder, and ran source install.sh. Please see the output below:

    readlink: invalid option -- 'b'
    Try 'readlink --help' for more information.
    Completed environment setup for mp4grep:
    MP4GREP_CACHe=./.mp4grep_cache
    MP4GREP_MODEL=./model
    PATH=$PATH:./bin
    
    Variable exported in ~/.bashrc
    

    I tried running mp4grep and it says command not found.

    opened by JunkyardCat 5
  • mp4grep won't accept a directory

    Hello, great tool, I really like it. This is maybe user error, but I'm running into a problem when feeding it a directory:

    $ ls test
    test1.wav*  test2.wav*  test3.wav*
    
    $ mp4grep "test" test/
    test3.wav does not exist, not transcribing.
    test2.wav does not exist, not transcribing.
    test1.wav does not exist, not transcribing.
    

    If I point it directly to one of the files, it works as expected, but pointing it to a directory doesn't seem to work.

    I encounter the same behavior with the prebuilt binary as well as one I build myself. I'm using version 0.1.4 on Arch.

    opened by jroneilky 2
  • Make install.sh detect shell type

    Heard about this project on Linux Unplugged, wanted to give it a try. I use zsh, and the install.sh butchered my prompt by trying to source ~/.bashrc into a zsh session. I re-wrote the script to detect if the user is running bash or zsh and change the output file accordingly, but this install.sh doesn't seem to be in the git tree anywhere so I wasn't able to make a pull request.

    Also, before editing a user's bashrc/zshrc we should give them the opportunity to do it manually if they wish. I know a lot of people (myself included) have a lot of custom stuff in their shell startup and the thought of an automated script messing with it always makes me nervous. (Yes, this script just appends lines to the end, but still.)

    Here's what I wrote, it could definitely be improved upon but I wanted to at least pass on the two minutes of work I did!

    #!/bin/sh
    
    SCRIPT_DIRECTORY=$(dirname "$(readlink -f "$0")")
    
    # Match on the shell's basename so that e.g. /bin/bash and /usr/bin/bash both work
    case "${SHELL##*/}" in
        bash) RCFILE=~/.bashrc ;;
        zsh)  RCFILE=~/.zshrc ;;
        *)    RCFILE=~/.profile ;; # if we can't detect the shell, fall back to ~/.profile
    esac
    
    echo "This installation script will try and set things up automatically."
    echo "In order to do so, some environment variables will be appended to $RCFILE"
    echo "If you want to do this manually, hit CTRL-C now and examine this script."
    echo "Otherwise, just hit enter and we'll do the work for you!"
    read wait # just wait for the user to read
    
    echo "" >> $RCFILE # in case the rcfile doesn't end with a newline, forcefully add one
    echo "# mp4grep environment variables -------" >> $RCFILE
    echo "export MP4GREP_CACHE='$SCRIPT_DIRECTORY/.mp4grep_cache'" >> $RCFILE
    echo "export MP4GREP_MODEL='$SCRIPT_DIRECTORY/model'" >> $RCFILE
    
    echo -n 'export PATH="$PATH:' >> $RCFILE
    echo "$SCRIPT_DIRECTORY/bin\"" >> $RCFILE
    echo "# --------------------------------------" >> $RCFILE
    
    echo "Completed environment setup for mp4grep: "
    echo "MP4GREP_CACHE=$SCRIPT_DIRECTORY/.mp4grep_cache"
    echo "MP4GREP_MODEL=$SCRIPT_DIRECTORY/model"
    echo -n 'PATH=$PATH:'
    echo "$SCRIPT_DIRECTORY/bin"
    echo ""
    echo "Variables exported in $RCFILE"
    
    source $RCFILE
    
    opened by MrAureliusR 2
  • Timestamp mismatch

    While transcribing a 00:05:34:09 long video, the last transcribed line had a 00:05:25:47 timestamp. In other words, over a five-and-a-half-hour video the transcription timestamps fell further and further behind the video clock.

    Video: 00:00:10:50, Timestamp: 00:00:10:35
    Video: 00:00:38:03, Timestamp: 00:00:37:04
    Video: 00:01:39:41, Timestamp: 00:01:37:12
    Video: 00:03:45:28, Timestamp: 00:03:39:50
    Video: 00:05:34:09, Timestamp: 00:05:25:47

    Based on the last pair: ((5*60+34)*60+9)/((5*60+25)*60+47) = 20049/19547 = 1.02568169028, so the video clock runs about 2.57% faster than the timestamp clock.

    I am using WSL Ubuntu 20.04 with the 1.5 GB Russian Vosk model "vosk-model-ru-0.22".
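The ratio in this report can be reproduced with a short shell sketch. The helper names `hms_to_seconds` and `drift_ratio` are hypothetical, and the inputs follow the report's own reading of the values as hours, minutes, and seconds (the leading 00 field is dropped).

```shell
# Convert an H:M:S string to a number of seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ s = ($1 * 60 + $2) * 60 + $3; print s }'
}

# Ratio of the video position to the reported transcript timestamp.
# A value above 1 means the timestamps lag behind the video.
drift_ratio() {
    video=$(hms_to_seconds "$1")
    stamp=$(hms_to_seconds "$2")
    awk -v v="$video" -v s="$stamp" 'BEGIN { printf "%.5f\n", v / s }'
}

drift_ratio 5:34:09 5:25:47   # prints 1.02568
```

Running `drift_ratio` on the other four pairs gives values between roughly 1.02 and 1.03, which hints at a roughly constant scale factor rather than a fixed offset.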

    opened by Imforpeace 1
  • Create releaseApproval.yml

    Adds a yml file to execute an approval workflow whenever a pull request or push is made to the "release" branch. This workflow can be added as an action in the protected branch settings.

    opened by dev-dwarf 0
  • FYI: Memory usage

    I monitored around 13GB RAM needed to use the 1.8GB vosk-model-en-us-0.22 to process a 209MB wav file with roughly 2 hours of audio.

    Until I upped the VM's RAM I couldn't get it to run without 'Segmentation fault'-ing. It didn't work well with Hyper-V's dynamic memory either, the process was always killed before memory was ballooned up to what was needed.

    With the RAM pre-allocated it didn't have any problem, just took a while with no indication of progress apart from occasional 'will do an exact optimization' messages.

    Thanks for the work done to create this tool!

    opened by jspraul 2
  • FYI: ocaml-multicore does not currently support arm64

    mp4grep does not support arm64 until https://github.com/ocaml-multicore/ocaml-multicore/issues/86 is resolved.

    We will take up arm64 after we hit OCaml 5.0 MVP.

    opened by jspraul 0
  • Feature speaker detection?

    Hi, I really appreciate your tool. It's such a great solution for making recordings more accessible for further investigation :smiley:

    I read that Vosk also has speaker identification / detection, and I'm wondering if you could add this to mp4grep as well? For me there are a lot of nice use cases for tracking and analysing discussions (TV shows, movies, phone recordings, podcasts, web conferences, ...). These enable research such as NLP or knowledge bases, and make multimedia content more accessible to users with disabilities, all done with privacy in mind and without contributing to major tech company algorithms.

    My understanding so far is that Vosk needs fingerprints for the different speakers, and maybe multiple fingerprints per person. So we will need a way to assign lines within a transcription to fingerprinted speakers and to give these fingerprints human-readable labels. In a second step, there might be a final processing pass that assigns these labels to every transcription line. Maybe we also need an extended transcription format like WebVTT to share these assigned lines and timecodes?

    opened by Matthias84 1
  • Doesn't work on macOS

    Hello! When I run this program with an mp3 file, I get the following error:

    ws.schild.jave.EncoderException: java.io.IOException: Cannot run program "/var/folders/gj/7wnvhh_92fn2cs9thjy0t0y40000gn/T/jave/ffmpeg-x86_64-3.2.0-osx": error=2, No such file or directory

    opened by b5i 1
Releases(0.1.5-linux-x86)
  • 0.1.5-linux-x86(Jan 4, 2023)

    Small fixes, mainly that a directory can be passed as an argument and all 16000 Hz wav files within will be transcribed concurrently. Install and build instructions are the same as the previous release.

  • 0.1.4-linux-x86(Apr 23, 2022)

    Instructions for building mp4grep have not changed since the previous release (see https://github.com/o-oconnell/mp4grep/releases/tag/0.1.3-linux).

    However, this release adds some small improvements and also contains a pre-built binary of mp4grep for x86 Linux, which means you do not need to install the OCaml compiler if you only want to install mp4grep.

    Installation steps, if you do not want to build:

    1. Download and decompress the code, then enter the resulting directory.

    2. (Optional) Add the bin directory containing mp4grep to your $PATH. The binary must stay in bin, because it has to keep its position relative to the Vosk .so under lib.

    3. Set your MP4GREP_CACHE and MP4GREP_MODEL environment variables to a convenient directory for text files and a Vosk model, respectively (see https://alphacephei.com/vosk/models for model downloads).

    4. Make sure you have ffmpeg and gcc installed. This has been tested on x86 Lubuntu 21.10 and x86 Ubuntu 20.04.

    After that, you should be good to go! Open an issue if you're having problems, and remember this has only been compiled for x86 Linux. You may need to refer to the build instructions (https://github.com/o-oconnell/mp4grep/releases/tag/0.1.3-linux) for other architectures.

  • 0.1.3-linux(Mar 22, 2022)

    mp4grep in OCaml. Installation and dependencies:

    Installation

    1. Install Opam

    2. Create a new switch (version of the compiler) by running opam switch create mp4grep 4.12.0+domains+effects. This version is necessary as mp4grep will be updated to take advantage of multicore domains in OCaml.

    3. Install parmap on your new switch: opam install parmap.

    4. Download mp4grep, untar/zip it, and cd into its directory.

    5. Execute source configure.sh --prefix [location to install mp4grep]. On Linux a good choice might be ~/.local/bin, which is often in your $PATH.

    6. Run make install.

    7. You will need to specify your Vosk-compatible transcription model and directory for cached transcriptions by setting MP4GREP_MODEL and MP4GREP_CACHE. You'll probably want to export them in your .bashrc or .zshrc as well: export MP4GREP_MODEL=/full/path/of/model, export MP4GREP_CACHE=/full/path/of/cache/dir.

    mp4grep-convert

    The mp4grep executable only takes single-channel, 16000 Hz wav files as input. Running make install also provides you with mp4grep-convert, which is a Bash script that will take directories or audio files as its arguments, extract audio files from directories, and convert them to wav files using ffmpeg.
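To illustrate the kind of conversion involved, here is a minimal sketch that turns one media file into a single-channel, 16000 Hz wav with ffmpeg. It is not the actual mp4grep-convert script. The function names and the `_16k.wav` output naming are hypothetical, chosen so that a .wav input is never overwritten.

```shell
# Hypothetical naming scheme: media/talk.mp4 -> media/talk_16k.wav
wav_name() {
    dir=$(dirname "$1")
    base=$(basename "$1")
    printf '%s/%s_16k.wav\n' "$dir" "${base%.*}"
}

# Convert one file to mono (-ac 1), 16000 Hz (-ar 16000) wav.
# -nostdin keeps ffmpeg from reading the terminal when run in a loop.
to_mono_16k_wav() {
    ffmpeg -nostdin -loglevel error -i "$1" -ac 1 -ar 16000 "$(wav_name "$1")"
}
```

A directory could then be processed with a plain loop such as `for f in recordings/*.mp3; do to_mono_16k_wav "$f"; done`, where `recordings` is a placeholder directory name.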

    Dependencies

    OCaml 4.12.0+domains+effects, parmap 1.2.4, and ffmpeg. The Makefile assumes that you have installed parmap using Opam, and looks under OPAM_INSTALL_PREFIX for it. You will have to modify the topmost ocamlc command in the Makefile if you have installed it another way.

  • v0.1.1(Nov 29, 2021)

    Fixed timestamp formatting issue. Added --clear-cache option. Added --transcribe and --transcribe-to-file options for printing transcriptions without search.

    Download mp4grep-0.1.1.zip. Run unzip mp4grep-0.1.1.zip. Then, cd mp4grep-0.1.1 and source install.sh.

    This and prior versions of mp4grep depend on Java 11+

    mp4grep-0.1.1.zip(92.31 MB)
  • 0.1.2(Nov 10, 2021)

Owner
ooc