Support non-identical file names between .wav and .eaf, and recognise media offsets #215
nicklambourne
left a comment
I think a big part of the original tickets was implementing a UI feature that would highlight when an audio file is uploaded without its corresponding .eaf file, or an .eaf file without its audio. The easiest way I can envisage to accomplish this is aligning the audio files horizontally in the UI with their transcriptions, which would make it obvious when a pair is missing either component (you could also highlight rows with a missing file, or something to that effect). The unfortunate downside is that you'll have to replicate the verification on the front end (this might help: https://www.npmjs.com/package/elan-parser).
Ah okay, this might be a bit more work to do on the uploading side of things, as it won't just be a file drop anymore. But I can look into it 👌
This reverts commit 09247da.
Resolves #191, #193.
This implementation doesn't give the user any choice as to whether to match the file name of the corresponding `.eaf` file or to just get it from `RELATIVE_MEDIA_URL`. It defaults to the former behaviour and falls back to the latter.

This implementation also ignores `MEDIA_URL`, as it is difficult to wrestle it (e.g. `"file:///Users/bbb/Desktop/abui/abui-audio-1.wav"`) into a format that the rest of the application can handle easily. In other words, it assumes that `RELATIVE_MEDIA_URL` is well formed.

This also fixes any `line = wer_lines[0] IndexError: list index out of range` errors that may have been happening before, although please double check that they are actually fixed.

Offsets are directly `int()`-ed from the `.eaf` file.
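
For reviewers, the matching order described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `find_audio_for_eaf`, the sample document, and the offset value of 1500 are hypothetical, and it assumes the offset lives in the `TIME_ORIGIN` attribute of the `MEDIA_DESCRIPTOR` element alongside `RELATIVE_MEDIA_URL`.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from typing import Iterable, Optional, Tuple

# Hypothetical, trimmed-down .eaf content used only for illustration.
SAMPLE_EAF = (
    '<ANNOTATION_DOCUMENT><HEADER TIME_UNITS="milliseconds">'
    '<MEDIA_DESCRIPTOR '
    'MEDIA_URL="file:///Users/bbb/Desktop/abui/abui-audio-1.wav" '
    'RELATIVE_MEDIA_URL="./abui-audio-1.wav" TIME_ORIGIN="1500"/>'
    '</HEADER></ANNOTATION_DOCUMENT>'
)


def find_audio_for_eaf(
    eaf_name: str, eaf_xml: str, audio_names: Iterable[str]
) -> Tuple[Optional[str], int]:
    """Return (matched audio file name or None, media offset in ms)."""
    names = set(audio_names)
    descriptor = ET.fromstring(eaf_xml).find("HEADER/MEDIA_DESCRIPTOR")

    # Assumption: the offset is TIME_ORIGIN, taken directly with int().
    offset = 0
    if descriptor is not None:
        offset = int(descriptor.get("TIME_ORIGIN", "0"))

    # 1. Default behaviour: a .wav with the same stem as the .eaf file.
    same_stem = PurePosixPath(eaf_name).stem + ".wav"
    if same_stem in names:
        return same_stem, offset

    # 2. Fallback: the file name part of RELATIVE_MEDIA_URL.
    #    MEDIA_URL is ignored, as in the description above.
    if descriptor is not None:
        relative = descriptor.get("RELATIVE_MEDIA_URL", "")
        candidate = PurePosixPath(relative).name
        if candidate in names:
            return candidate, offset

    # No audio found: the caller can flag this .eaf as unpaired.
    return None, offset
```

Returning `None` for the audio name, rather than raising, leaves room for the UI to highlight an unpaired `.eaf` file as suggested in the review comment.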