Hi,
thanks for developing such an efficient and needed tool!
I have been looking through this and other ML4GLand repositories for best-practice examples of reading a genome FASTA file together with a BAM or BED file to produce one-hot encoded sequences and the corresponding coverage arrays. However, most of the examples I found refer to an already existing Zarr object. Is there an example available that shows how to build such a dataset?
I have seen the API reference documentation and can guess how to do it, but I am unsure whether I would end up doing it in the most efficient way. I hope I have not missed something...
Thanks very much in advance, best,
Miquel