Hello all,
I am trying to run SentAugment as part of my project, for clustering purposes, but I am running into several issues. I am using a subset of the CommonCrawl data.
Issue 1:
  File "src/flat_retrieve.py", line 37, in <module>
    _, indices = torch.topk(scores, params.k, dim=0) # K x Q
NameError: name 'params' is not defined
  File "SentAugment/src/flat_retrieve.py", line 42, in <module>
    for k in range(K):
NameError: name 'K' is not defined
Proposed solution:
Is this a bug? Should both params.k and the bare K be args.K instead?
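
For reference, a minimal sketch of that change. It assumes the script parses its options with argparse into a variable named args and exposes the neighbour count as --K; both are assumptions, not confirmed from the repository. heapq.nlargest stands in for torch.topk so the snippet runs without PyTorch.

```python
import argparse
import heapq


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Assumption: the neighbour count is exposed as --K (hypothetical flag name).
    parser.add_argument("--K", type=int, default=2)
    return parser.parse_args(argv)


def top_k_indices(scores, k):
    """Stand-in for torch.topk: indices of the k largest scores, best first."""
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)


args = parse_args(["--K", "2"])
K = args.K  # bind K once so both the top-k call and the loop can resolve it

indices = top_k_indices([0.1, 0.9, 0.4, 0.7], K)
for k in range(K):
    print(k, indices[k])
```

The point is only that every use of the neighbour count reads from the same parsed value, so neither params nor a bare K is left undefined.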
Issue 2:
  File "src/flat_retrieve.py", line 43, in <module>
    print(IndexTextQuery(txt_mmap, ref_mmap, indices[k][qeury_idx]))
  File "/home/username/FAIRCluster/SentAugment/src/indexing.py", line 95, in IndexTextQuery
    return b[0:i].decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xca in position 8: invalid continuation byte
I followed all the steps mentioned, but I get the error above when I run step 3. Can somebody help with that?
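
One way to narrow this down is to decode leniently and look at where the invalid byte sits. The sketch below uses a made-up byte string (not real SentAugment data) with 0xca at position 8; it only illustrates the errors="replace" option of bytes.decode and does not explain why the byte is there in the first place.

```python
# Hypothetical sample: byte 0xca at position 8 is followed by a space, so it
# starts a two-byte UTF-8 sequence that never gets its continuation byte.
raw = b"SentAugm\xca rest"

bad_pos = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    bad_pos = err.start  # offset of the first undecodable byte
    print("strict decode failed at byte", bad_pos)

# Lenient decode: undecodable bytes become U+FFFD instead of raising.
text = raw.decode("utf-8", errors="replace")
print(text)
```

If the replacement character shows up at the very start or end of many retrieved sentences, that would hint that the slice boundaries cut through a multi-byte character or point at the wrong place in the file, rather than that the text itself is not UTF-8.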
Issue 3:
After removing the decode call and running it again, I get:
  File "src/flat_retrieve.py", line 43, in <module>
    print(IndexTextQuery(txt_mmap, ref_mmap, indices[k][qeury_idx]))
  File "/home/username/FAIRCluster/SentAugment/src/indexing.py", line 92, in IndexTextQuery
    while txt_mmap[p+i] != 10 and i < dim:
  File "/home/username/anaconda3/envs/envConda6/lib/python3.7/site-packages/numpy/core/memmap.py", line 331, in __getitem__
    res = super(memmap, self).__getitem__(index)
IndexError: index 25580 is out of bounds for axis 0 with size 8000
Any idea what is wrong here?
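
The traceback shows the loop reading txt_mmap[p+i] before checking any bound, and the index (25580) is already larger than the buffer (8000). A bounds-checked version of such a line reader might look like the sketch below. It is not the repository's IndexTextQuery: read_line_at and max_len are hypothetical names, and a plain bytes object stands in for the memory-mapped text file.

```python
def read_line_at(buf, start, max_len=1024):
    """Return the bytes from `start` up to the next newline (byte value 10),
    never reading past the end of `buf` or more than `max_len` bytes."""
    if not 0 <= start < len(buf):
        # Fail early with a readable message instead of an IndexError mid-loop.
        raise ValueError(f"offset {start} is outside a buffer of {len(buf)} bytes")
    end = start
    limit = min(len(buf), start + max_len)
    # Check the bound first, then the byte, so buf[end] is always valid.
    while end < limit and buf[end] != 10:
        end += 1
    return buf[start:end]


text = b"first sentence\nsecond sentence\nlast one without newline"
print(read_line_at(text, 15))
print(read_line_at(text, 31))
```

A check like this only turns the crash into a clearer error. If the start offset itself is beyond the buffer, as in the traceback, the offsets and the text buffer probably do not describe the same data, which is an assumption worth verifying by comparing the size of the text file with the largest stored offset.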
Thanks in advance. Any help appreciated!