Gregory J. Deckler on 21 Apr 2016 00:52:48
Provide the option of not removing duplicates automatically when creating R visualizations, or provide the ability to create R datasets using the same syntax as shown in the comments when creating an R visualization.
- Comments (16)
RE: "R" Don't remove duplicates
It has been eons. Why is this still under review?
RE: "R" Don't remove duplicates
This is an old post, but still a relevant topic. The option to keep duplicates is still not implemented, but it would greatly benefit people working with intentionally duplicated rows, who currently have to add a unique ID (e.g. datasets with a lot of categorical values, where duplicates are common). Any update on this topic?
RE: "R" Don't remove duplicates
This seems to have been under review for two and a half years now. Any update?
RE: "R" Don't remove duplicates
Yes, it should not be there. The histogram I have created is not a proper one.
RE: "R" Don't remove duplicates
Please allow us to keep duplicates. They are also part of the historical data, and we need them for our forecasting in Power BI.
RE: "R" Don't remove duplicates
I strongly agree. It should be possible to turn off duplicate removal as an option.
RE: "R" Don't remove duplicates
Please do the same for Python!
RE: "R" Don't remove duplicates
Microsoft always adds useless features that are always dumb. The very least they could do for this stupid feature is make it possible to disable it.
RE: "R" Don't remove duplicates
The workaround for this is to use an index column. However, this does not work if the data comes from different tables! I would have to create a new table containing the variables of interest, with an index column for that new table. This would defeat the purpose of using a relational data structure, among other things.
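For readers unfamiliar with the index-column workaround, here is a minimal sketch in plain Python of why it keeps repeated rows. It does not use any Power BI API: `dedupe` and `add_index` are hypothetical helpers that only imitate the behaviour described in this thread, and the sample rows are made up.

```python
from collections import Counter

# Made-up rows with a single categorical field, as they might reach a script visual.
rows = [("red",), ("blue",), ("red",), ("red",), ("blue",)]

def dedupe(records):
    """Imitate automatic duplicate removal: keep the first copy of each distinct row."""
    return list(dict.fromkeys(records))

def add_index(records):
    """Prefix each row with a running index so that no two rows are identical."""
    return [(i, *rec) for i, rec in enumerate(records)]

# Without an index, identical rows collapse into one row per category.
without_index = dedupe(rows)

# With an index, every row is distinct, so duplicate removal drops nothing.
with_index = dedupe(add_index(rows))

counts_without = Counter(color for (color,) in without_index)
counts_with = Counter(color for _, color in with_index)

print(counts_without)
print(counts_with)
```

In this sketch the frequencies only survive when the index is present, which is why a histogram built from the de-duplicated field comes out wrong.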
RE: "R" Don't remove duplicates
Why does it remove duplicates by default? When performing univariate qualitative analysis, I want to be able to drop in a single qualitative field. This means I WANT duplicates, and having to add keys complicates the analysis needlessly.
That automatic removal should be made an option. I believe that it was added because R scripts in Power BI are limited to 150k rows. The removal of duplicates by Power BI (in what I assume to be some sort of preprocessor-directive-like call, judging by the invalid code syntax shown in the editor) probably helps mitigate that limitation for certain types of data sets. Unfortunately, without the option to turn off that "preprocessor"-like call, an entire segment of potential analysis is complicated or even impossible (if the original data set has no key).
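To make the guess about the row limit concrete, here is a rough sketch in plain Python. It assumes the 150k figure quoted above (not verified here) and uses a made-up low-cardinality data set; nothing in it reflects how Power BI actually implements the removal.

```python
import random

ROW_LIMIT = 150_000  # figure quoted in this comment; treated as an assumption

random.seed(0)
# Made-up data set: 200,000 rows with two categorical fields of low cardinality.
rows = [(random.choice("ABC"), random.choice("XY")) for _ in range(200_000)]

# Duplicate removal keeps only the distinct combinations of field values.
distinct_rows = set(rows)

print("raw rows exceed limit:", len(rows) > ROW_LIMIT)
print("distinct rows fit within limit:", len(distinct_rows) <= ROW_LIMIT)
```

The de-duplicated set fits easily under the limit, but the per-category frequencies are gone, which is exactly the information a univariate analysis needs.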