Can I also use mp3 files for training instead of wav files? #53
Comments
Hello @Jochen-sys, I don't think that's possible. |
@Jochen-sys because the spectrograms are generated from the wav files. |
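If the pipeline only reads wav, one workaround is to convert the mp3 files up front. The following is a minimal sketch, assuming the `ffmpeg` command-line tool is installed and on the PATH. The helper names, the 8000 Hz sample rate and the mono output are illustrative assumptions, not settings taken from this repository.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=8000):
    """Return the ffmpeg argument list that converts src to a mono wav file.

    sample_rate is an assumption; set it to whatever the training code expects.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file (mp3)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        str(dst),
    ]


def convert_mp3_to_wav(src, dst=None, sample_rate=8000):
    """Convert one mp3 file to wav; requires ffmpeg on the PATH."""
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix(".wav")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on the PATH")
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return dst
```

Note that the wav copies take more disk space than the mp3 originals, so converting in batches may be necessary when storage is tight.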
@CracKCatZ Thanks for the quick answer. |
@Jochen-sys Hmm, good question. I actually don't know whether this is possible, but I think so: you can do one part with TensorFlow and the other with PyTorch. You just need to feed the spectrograms into PyTorch somehow. |
@Jochen-sys But why would you want to use mp3 files instead of wav files? Wav files are much easier to handle and format :) |
@CracKCatZ First of all, I don't have much storage on my computer. |
@Jochen-sys mp3 files are like any normal files; wav files (wave files) are constructed differently. They also look quite different, because every sound is displayed as a wave, which is not the case for mp3s. I don't know how it would change the performance of the model or the training. You can add me on Discord: SheeeshForce#8083 |
@CracKCatZ Thanks for explaining. I wanted to test engine.py, but I got the error "ImportError: cannot import name 'imsave'". imsave is from scipy.misc, and according to Stack Overflow it should come from imageio instead. Now I'm confused, because I thought it should work with imsave?! Could you help me out there, please? |
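For context on this error: `scipy.misc.imsave` was removed from newer SciPy releases, and `imageio.imwrite` is the usual replacement. A minimal sketch of an import that works with either, assuming the calling code only needs a function named `imsave` (whether engine.py itself needs this is not confirmed in this thread):

```python
# scipy.misc.imsave only exists in old SciPy versions, so this import
# fails on current ones. imageio.imwrite is the usual replacement.
try:
    from scipy.misc import imsave
except ImportError:
    try:
        from imageio import imwrite as imsave  # requires `pip install imageio`
    except ImportError:
        imsave = None  # neither backend is installed
```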
@Jochen-sys I'm actually not familiar with imsave or imageio. |
@CracKCatZ Ok but can you run engine.py without problems? |
@Jochen-sys Not at the moment, because I have to switch to Linux to install the ctcdecoder. |
@CracKCatZ OK, I'm sorry, I'm an idiot. I fixed it. My problem was that I thought neuralnet was a regular PyPI package and not a custom, self-written one. Why did you give scripts or folders the same names as existing packages on PyPI :-) (sadly, there is no laughing smiley)? |
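A name clash like this can be spotted by checking which file Python would actually import for a given name. A small sketch using only the standard library; `module_origin` is a hypothetical helper name, not something from this repository:

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Running `print(module_origin("neuralnet"))` from the project directory shows whether the name resolves to the local folder or to a package under `site-packages`.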
@CracKCatZ Where exactly are the spectrograms produced? There is so much code dealing with spectrograms that I can't find the exact place. |
@CracKCatZ Which version of ctcdecode do you use? (Mine worked a few days ago, but then it failed.) |
@Jochen-sys I haven't been able to test ctcdecode yet. |
@CracKCatZ Ok got it. |
Hey @Jochen-sys, yes you have :) The model will be saved automatically as a ckpt file :) Yes, I think you need one too (by the way, I need one as well), because I think we get no outputs without the zip. Could you please add me on Discord so we can talk there and speed up communication? Name:SheeeshForce1#8083 |
Sorry, I don't have Discord. Does anyone know what this means? I then tried to use the argument --resume_from_checkpoint (sorry, I don't know what this argument does) instead of --load_model_from. But this doesn't work either. Following error: |
OK, I fixed the first error. My version of pytorch_lightning was too old. |
@Jochen-sys It means that you pass in a checkpoint file, either as the default value or through the terminal (set required to false if you set it as the default), and training resumes from this checkpoint. You basically use it to resume training if you stopped it, to test the checkpoint (the model that you create in optimize_graph.py), or if your PC shuts down for an unknown reason while training. |
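The "default or through the terminal" part can be sketched with `argparse`. This is an illustration under assumptions: only the two flag names come from this thread, while `build_parser`, the defaults and the help texts are hypothetical and may differ from the project's actual training script.

```python
import argparse


def build_parser():
    """Build a parser with two optional checkpoint arguments (sketch)."""
    parser = argparse.ArgumentParser(description="training options (sketch)")
    # required=False plus a default means the flag can be omitted entirely.
    parser.add_argument("--load_model_from", type=str, default=None, required=False,
                        help="path to a checkpoint file to load (assumed meaning)")
    parser.add_argument("--resume_from_checkpoint", type=str, default=None, required=False,
                        help="path to a checkpoint file to resume training from")
    return parser
```

With this setup, `build_parser().parse_args(["--resume_from_checkpoint", "last.ckpt"])` picks up the path from the terminal, and omitting the flag leaves the value at its default.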
@CracKCatZ Do you know why loss could be "nan"? At the beginning it worked with a real float, but now I only see this string there. I researched this, but didn't find a good cause. |
@Jochen-sys Yes, CUDA and cuDNN are not installed correctly. You can search YouTube for videos on installing CUDA and cuDNN correctly :) |
@CracKCatZ Oh, OK, that's interesting, thanks. I'm using my CPU. |
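A NaN loss can also appear when training on the CPU. Commonly cited causes are a learning rate that is too high or problematic input samples, though this thread does not establish which applies here. A first debugging step is to find the training step at which the loss stops being a finite number. A minimal sketch, assuming the loss values have been collected as plain Python floats (`first_non_finite` is a hypothetical helper):

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None
```

Knowing the step makes it possible to inspect the batch that was processed right before the loss broke.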
@Jochen-sys are you working with mp3 files now? |
On Windows, no, because mp3 files don't work there, but it works on Linux. I'm training on my Windows system; I only have Linux as a VM. |
@CracKCatZ Do you know what this ken_lm is for and where I could get it? Is this the file that improved the transcription in the video so much? If not, which file was it? |
@Jochen-sys Have you already tested the speech recognition? |
@CracKCatZ Yes with the zip model. But it's not so good. But I remember that in the video he used something else, too, to get good results. |
@Jochen-sys Hold up, did you use the portaudio library? I think this library is required and can give better results. Could you please tell me whether you already had portaudio installed when you started working with this project, or whether you had to install it? |
@CracKCatZ Sorry for the late response. Yes, I think so. |
OK, I tried some things, and it seems that there is a problem with capital letters, but only in the first line. The second line isn't affected by capital letters. |
@CracKCatZ Hi, I am going to set up my mp3 files. Where should my mp3 files go? I mean, which part should point to the mp3 files: scripts\common_voice_json..? Is that right? |
Hello, first of all, nice work!
I wanted to ask whether I can use mp3 files instead of wav files and, if that works, which lines I would have to change.