This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Facing multiple issues while running src/flat_retrieve.py #7

@karthickpgunasekaran

Hello All,
I am trying to run SentAugment as part of my project for clustering purposes, but I am facing multiple issues when running it. I am using a part of the CommonCrawl data for this purpose.

Issue 1:
File "src/flat_retrieve.py", line 37, in <module>
_, indices = torch.topk(scores, params.k, dim=0) # K x Q
NameError: name 'params' is not defined

File "SentAugment/src/flat_retrieve.py", line 42, in <module>
for k in range(K):
NameError: name 'K' is not defined

Proposed Solution:

Is this a bug? Should both params.k and the bare K be args.K instead?
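
A minimal sketch of the proposed fix, assuming the script parses its options with argparse into a variable called args and exposes the neighbour count as a --K flag (both names are assumptions taken from the question, not confirmed against the repository). To stay dependency-free, a plain sorted() call stands in for torch.topk here; the point is only that K is bound once from args before any line uses it.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser: the real script may name or spell this flag differently.
    parser = argparse.ArgumentParser(description="Toy retrieval CLI (illustrative only)")
    parser.add_argument("--K", type=int, default=2, help="number of neighbours per query")
    return parser.parse_args(argv)


def top_k_indices(scores, k):
    """Return the indices of the k largest scores, best first.

    Stand-in for torch.topk so the sketch runs without PyTorch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


args = parse_args(["--K", "2"])
K = args.K  # bind K once so later code such as `for k in range(K)` can see it

indices = top_k_indices([0.1, 0.9, 0.4, 0.7], K)
for k in range(K):
    print(k, indices[k])
```

With K defined this way, both NameErrors go away, because neither params nor an unbound K is referenced any more.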

Issue 2:
File "src/flat_retrieve.py", line 43, in <module>
print(IndexTextQuery(txt_mmap, ref_mmap, indices[k][qeury_idx]))
File "/home/username/FAIRCluster/SentAugment/src/indexing.py", line 95, in IndexTextQuery
return b[0:i].decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xca in position 8: invalid continuation byte

I followed all the steps mentioned, but I get the error above when I run step 3. Can somebody help with that?
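
One possible workaround, sketched under the assumption that the bytes being decoded are mostly UTF-8 with an occasional stray byte (for example from a file in another encoding, or from an offset that lands in the middle of a character). The helper name decode_sentence is hypothetical. It tries a strict decode first and falls back to errors="replace", which swaps each undecodable byte for U+FFFD instead of raising.

```python
def decode_sentence(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing undecodable bytes instead of raising."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" substitutes U+FFFD for each invalid byte sequence.
        return raw.decode("utf-8", errors="replace")


# 0xca starts a two-byte UTF-8 sequence, but the next byte (a space) is not a
# valid continuation byte, so a strict decode fails at position 8.
raw = b"sentence\xca text"
text = decode_sentence(raw)
print(text)
```

This only hides the symptom. If many sentences come back with replacement characters, the offsets or the encoding of the text file are probably worth checking instead.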

Issue 3:
After removing the decode call and running it again:

File "src/flat_retrieve.py", line 43, in <module>
print(IndexTextQuery(txt_mmap, ref_mmap, indices[k][qeury_idx]))
File "/home/username/FAIRCluster/SentAugment/src/indexing.py", line 92, in IndexTextQuery
while txt_mmap[p+i] != 10 and i < dim:
File "/home/username/anaconda3/envs/envConda6/lib/python3.7/site-packages/numpy/core/memmap.py", line 331, in __getitem__
res = super(memmap, self).__getitem__(index)
IndexError: index 25580 is out of bounds for axis 0 with size 8000

Any guess as to what's wrong here?
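
The traceback shows the loop indexing txt_mmap[p+i] before any check against the size of the buffer, so an offset past the end of the mapped text (25580 against a size of 8000) raises IndexError. One guess, not verified here, is that the offsets were built from a different or larger text file than the one being mapped. A bounds-checked read could look like the sketch below; read_line_at and its parameters are hypothetical names, and a plain bytes object stands in for the numpy memmap.

```python
def read_line_at(buf, start, max_len):
    """Return the bytes from `start` up to the next newline.

    Stops at a newline (byte value 10), after `max_len` bytes, or at the end
    of `buf`, whichever comes first. Raises ValueError for a start offset
    that lies outside the buffer.
    """
    if start < 0 or start >= len(buf):
        raise ValueError(f"offset {start} is outside a buffer of size {len(buf)}")
    limit = min(start + max_len, len(buf))
    end = start
    # Check the bound before indexing so the loop can never run past the buffer.
    while end < limit and buf[end] != 10:
        end += 1
    return bytes(buf[start:end])


buf = b"first line\nsecond line\nlast"
print(read_line_at(buf, 11, 256))
print(read_line_at(buf, 23, 256))
```

An out-of-range offset now fails early with a message that names both the offset and the buffer size, which makes a mismatch between the offsets and the text file easier to spot.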

Thanks in advance. Any help appreciated!
