In order to apply LLM2Vec to DictaLM we need to:

- [x] Identify the base model - https://huggingface.co/collections/dicta-il/dicta-lm-20-collection-661bbda397df671e4a430c27
- [x] Enable bi-directional attention
- [x] Prepare a dataset for MNTP training - https://mboyanov.github.io/2024/08/31/BiMNTP.html
- [x] Create a json configuration:
  - https://towardsdatascience.com/turn-llama-3-into-an-embedding-model-with-llm2vec-8448005f99aa
  - https://github.com/mboyanov/bg2vec/blob/master/model_configurations/bggpt-7b.json
- [x] Run the `run_mntp` script against the model and configuration file (see resources above)
  - https://github.com/mboyanov/bg2vec/blob/master/1.%20Bi-MNTP%20Training.ipynb
- [x] Test the newly created model
- [x] Finetune using unsupervised contrastive learning (SimCSE) - https://mboyanov.github.io/2024/09/11/SimCSE.html
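For the configuration step, a minimal MNTP config might look like the sketch below. This is an assumption-laden illustration modeled on the `bggpt-7b.json` file linked above, not the exact configuration used: the model id, dataset fields, and hyperparameter values are placeholders (in particular, the dataset would need to be a Hebrew corpus, as discussed in the MNTP dataset post).

```json
{
  "model_name_or_path": "dicta-il/dictalm2.0",
  "dataset_name": "wikitext",
  "dataset_config_name": "wikitext-103-raw-v1",
  "do_train": true,
  "do_eval": true,
  "max_seq_length": 512,
  "per_device_train_batch_size": 32,
  "mlm_probability": 0.2,
  "lora_r": 16,
  "gradient_checkpointing": true,
  "torch_dtype": "bfloat16",
  "output_dir": "output/mntp/dictalm2.0"
}
```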
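The "enable bi-directional attention" step amounts to dropping the causal mask that a decoder-only LM applies in self-attention, so every token can attend to every other token. The toy NumPy sketch below illustrates the effect on the attention weights; it is a conceptual demo only, not LLM2Vec's actual implementation (which patches the model's attention modules).

```python
import numpy as np

def attention_weights(scores: np.ndarray, causal: bool) -> np.ndarray:
    """Softmax over raw attention scores, optionally restricted by a
    causal (lower-triangular) mask as in a decoder-only LM."""
    seq_len = scores.shape[-1]
    if causal:
        # Positions above the diagonal (future tokens) get -inf,
        # so their softmax weight is exactly zero.
        mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))

causal = attention_weights(scores, causal=True)
bidirectional = attention_weights(scores, causal=False)

# Causal: the first token can only attend to itself.
assert causal[0, 0] == 1.0 and (causal[0, 1:] == 0.0).all()
# Bi-directional: every token attends to every position.
assert (bidirectional > 0).all()
```

With the causal path, each row of the weight matrix is zero above the diagonal; without it, the same scores yield a dense matrix, which is what makes mean-pooled embeddings over the full sequence meaningful.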