Process Documents operator ends up with more documents than example set
I have an example set with 733 examples and a text attribute. I convert it to TF-IDF using the Data to Documents and Process Documents operators. Inside Process Documents I clean up and tokenize the text, then use Write Document, and I end up with 1466 files.
Why do I end up with twice as many files as examples?
How do I ensure that 1 document in = 1 document out?
I have Extract Content set to negate every tag possible, but I still end up with 2 outputs for every 1 input. From a brief look, file 734 seems similar to file 1, so it's as if the whole thing loops twice for some reason.
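One way to confirm the "loops twice" suspicion is to check the written files for exact duplicates. The following is a minimal Python sketch run outside RapidMiner, not part of the process itself. It assumes all Write Document output lands in a single folder; the function name `find_duplicate_groups` and the folder path are hypothetical. It only detects byte-identical files, so files that are merely similar will not be grouped.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_groups(directory):
    """Group the files in `directory` by content hash.

    Returns a list of groups (lists of file names) whose contents are
    byte-for-byte identical. Files with unique content are omitted.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical usage: point this at the folder Write Document writes to.
# dupes = find_duplicate_groups("path/to/output_folder")
# print(len(dupes), "groups of identical files")
```

If this reports roughly 733 groups of two files each, every document was written twice with identical content. If it reports few or no groups, the two copies differ, which would suggest they were written at different stages of the processing.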
Answers
That way I can properly see what's happening.
Thanks