Add UTF-8 support in data (lookup tables)

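Non-ASCII characters such as `å`, `ä` and `ö` are lost when data is stored in a cdb: the values come back as empty strings and the lookup table gets empty labels. Need to find a solution for this. See the example below: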

``` r
library(Coldbir)
library(data.table)  # provides data.table(), in case it is not already attached
a <- cdb()
dt <- data.table(
    x = c('a', 'b', 'a', 'o', 'a', 'o', 'o'),
    y = c('a', 'b', 'å', 'ö', 'a', 'ö', 'ö')
)
a[] <- dt
# Warning message:
# In `[.data.table`(y, xkey, nomatch = ifelse(all.x, NA, 0), allow.cartesian = allow.cartesian) :
#   A known encoding (latin1 or UTF-8) was detected in a join column. data.table compares the bytes 
# currently, so doesn't support *mixed* encodings well; i.e., using both latin1 and UTF-8, or if any 
# unknown encodings are non-ascii and some of those are marked known and others not. But if either 
# latin1 or UTF-8 is used exclusively, and all unknown encodings are ascii, then the result should be ok. 
# In future we will check for you and avoid this warning if everything is ok. The tricky part is doing this 
# without impacting performance for ascii-only cases.

a[]
#    x y
#1: a a
#2: b b
#3: a  
#4: o  
#5: a a
#6: o  
#7: o  
# Warning message:
# In `levels<-`(`*tmp*`, value = c("a", "b", "", "")) :
#   duplicated levels in factors are deprecated
```

The resulting `lookup.txt` for variable `y` has empty labels for keys 3 and 4:

```
1       a
2       b
3
4
```
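
One possible direction, sketched here with base R only: normalise all labels to UTF-8 before they are written, and read the file back with the encoding declared. This is not how Coldbir currently writes `lookup.txt`; the helper names `write_lookup_utf8` and `read_lookup_utf8` are hypothetical, and the tab-separated layout is an assumption based on the file shown above.

``` r
# Hypothetical helpers, not part of Coldbir's API.

# Write a key/label lookup file with all labels stored as UTF-8.
write_lookup_utf8 <- function(labels, path) {
    # Convert latin1 or native-encoded strings to UTF-8
    labels <- enc2utf8(as.character(labels))
    con <- file(path, open = "wb")
    on.exit(close(con))
    # useBytes = TRUE writes the UTF-8 bytes as they are,
    # without re-encoding them to the native locale
    writeLines(paste(seq_along(labels), labels, sep = "\t"), con, useBytes = TRUE)
    invisible(path)
}

# Read the lookup file back, marking the strings as UTF-8.
read_lookup_utf8 <- function(path) {
    lines <- readLines(path, encoding = "UTF-8", warn = FALSE)
    parts <- strsplit(lines, "\t", fixed = TRUE)
    data.frame(
        key = as.integer(vapply(parts, `[`, character(1), 1)),
        value = vapply(parts, `[`, character(1), 2),
        stringsAsFactors = FALSE
    )
}

# Mixed input: one label marked UTF-8, one marked latin1
labels <- c("a", "b", "\u00e5", iconv("\u00f6", "UTF-8", "latin1"))
path <- tempfile(fileext = ".txt")
write_lookup_utf8(labels, path)
lk <- read_lookup_utf8(path)
```

Converting everything to a single encoding up front should also avoid the mixed-encoding situation that the data.table warning describes, since the join columns would then only contain UTF-8 or ASCII strings.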


Add UTF-8 support in data (lookup tables) #95
