Conversation
This looks good to me. I'm wondering if it's possible to somehow generate a list of each package and its version. Also, I think the reason for excluding the datasets is to reduce the size? What if we tar the datasets folder?
Hi, thanks for the comment.
Do you mean having a Docker image for each library?
Yes, the datasets directory is 2GB, which is too big to store in the container. If we keep it as an archive (.tar.gz), the size is around 700MB. I'm not sure whether it's worth it. What do you think?
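For reference, a minimal sketch of how such an archive could be produced and its size compared with the unpacked directory. The function name archive_datasets and the paths passed to it are assumptions for illustration, not part of the repository:

```shell
#!/bin/sh
# Sketch: pack a datasets directory into a gzip-compressed tarball and
# print the size of both, so the saving can be compared.
# The function name and argument names are hypothetical.
archive_datasets() {
    src_dir="$1"    # directory to pack, e.g. datasets
    archive="$2"    # output file, e.g. datasets.tar.gz

    # -C switches to the parent directory first, so paths stored in the
    # archive start at the directory name rather than the full path.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"

    # Human-readable sizes of the original directory and the archive.
    du -sh "$src_dir" "$archive"
}
```

Calling archive_datasets datasets datasets.tar.gz from the repository root would write the archive next to the directory and print both sizes.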
Yeah, I'm not sure that is something we should do, since in that case we would have to update not only the Dockerfile that contains all libs but the one for the single lib as well.
I see. I'd like to keep it as simple as possible; sharing the dataset folder might not be the easiest solution. Can you think of anything else we could do? Perhaps 700MB isn't that bad?
Before investigating further, may I ask how you plan to run this container?
The easiest for me would be to have something that runs out of the box |
This is a first Dockerfile that aims to make the system more portable and easier to run, addressing #133.
The Dockerfile is structured such that
This image is built using a modified config.yaml. In particular, Shogun's KMEANS and DTC sections are:
Results from executions.
Suppose the relevant datasets have already been downloaded using make datasets. The image is built using the following command:
docker build -t benchmark .
Could you please give me feedback or comments? Meanwhile, I will add more libraries to the image.
Update :
heytitle/mlpack-benchmarks.