Question regarding cluster + edit feature (ngram algorithm)

Hello,

I don't understand how values are grouped when the ngram algorithm is used.

Could you please explain it with some simple sample data?

Best Regards,
Momoko

Answers

  • Momoko Fukuda Posts: 2
    edited April 2, 2020 5:39AM
    Hi,

    My apologies for the delayed response, and thank you for your answer.
    Please let me ask another question.
    I tried testing with the data below:

    test
    aaabcde
    fgaaahij
    klmnoaaa
    opraastu

    I set "NGram Size" to 3 and expected the three values "aaabcde", "fgaaahij" and "klmnoaaa" to be grouped because they all contain "aaa", but they were not grouped.
    What kind of data can be grouped this way? Or should I change the parameter?

    Best Regards,
    Momoko
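
For anyone reading along, here is a minimal sketch of one common form of ngram clustering, the "n-gram fingerprint" key-collision method. It is an assumption that the cluster + edit feature behaves this way; the thread does not confirm it, and the function names `ngram_fingerprint` and `cluster` are made up for illustration. Under this method, two values are grouped only when their complete sets of n-grams are identical, not when they merely share a single n-gram such as "aaa", which would be consistent with the test data above not being grouped.

```python
import re


def ngram_fingerprint(value: str, n: int) -> str:
    """Build a key from the sorted, de-duplicated n-grams of a value.

    Assumed normalization: lowercase, then drop whitespace and punctuation.
    """
    cleaned = re.sub(r"[\W_]+", "", value.lower())
    grams = {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}
    return "".join(sorted(grams))


def cluster(values, n):
    """Group values whose fingerprints collide; keep groups of 2 or more."""
    groups = {}
    for v in values:
        groups.setdefault(ngram_fingerprint(v, n), []).append(v)
    return [g for g in groups.values() if len(g) > 1]


# The data from the post: each value has a different set of 3-grams,
# so no two fingerprints match and nothing is grouped.
posted = ["aaabcde", "fgaaahij", "klmnoaaa", "opraastu"]
print(cluster(posted, 3))

# Values that differ only in case, spacing, punctuation, or repetition
# produce the same set of 3-grams and therefore fall into one group.
similar = ["aaabcde", "AAA bcde", "aaa-bcde", "abcabc", "abcabcabc"]
print(cluster(similar, 3))
```

Under this assumption, a smaller n-gram size makes the key coarser (more values collide), while a larger size makes it stricter.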