For the love of all that is holy, please create a way to designate old data sources that don't get refreshed so that Power BI doesn't have to reload them every time a source they're appended to is refreshed.
Example:
I have two .csv files as sources, each about 300MB. One houses old data that never changes; the other gets new rows added daily through a separate process. The old .csv is appended to the new .csv in Power BI. When I refresh the new .csv, the old .csv also loads. That's not too big of a problem in and of itself, but the mashup evaluation container explodes, growing to use about 10GB when loading the new .csv, and continuing to grow to consume all available RAM (I have 16GB) while loading the old .csv. About halfway through loading the old .csv, it grinds to a halt, as all of my RAM is used up and (I presume) it has to start using the page file. Loading these two .csv files takes about 45 minutes in Power BI, but loading them and rbinding them in R takes about 45 seconds. This is atrocious, shameful performance, and Microsoft needs to fix this issue -- which has been known for YEARS -- right now.
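To make the request concrete, here is a rough sketch in Python of the behavior I'm asking for. It is not how Power BI works internally. The `AppendedSource` class, the `read_rows` helper, and the file names are all made up for illustration. The static file is read once and kept, and each refresh re-reads only the file that changes.

```python
import csv
import os
import tempfile


def read_rows(path):
    """Read a CSV file into a list of dicts (one per data row)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


class AppendedSource:
    """Hypothetical helper: appends a static file to a changing file,
    reading the static file only on the first refresh."""

    def __init__(self, static_path, changing_path):
        self.static_path = static_path
        self.changing_path = changing_path
        self._static_rows = None   # cached rows from the file that never changes
        self.static_loads = 0      # how many times the static file was read

    def refresh(self):
        # Load the static file only if it has not been loaded before.
        if self._static_rows is None:
            self._static_rows = read_rows(self.static_path)
            self.static_loads += 1
        # Always re-read the changing file, then append it to the cached rows.
        return self._static_rows + read_rows(self.changing_path)


def demo():
    """Build two tiny CSV files, refresh twice, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "old.csv")
        new_path = os.path.join(tmp, "new.csv")
        with open(old_path, "w", newline="") as f:
            f.write("id,value\n1,a\n2,b\n")
        with open(new_path, "w", newline="") as f:
            f.write("id,value\n3,c\n")

        source = AppendedSource(old_path, new_path)
        first = source.refresh()

        # Simulate the daily process adding a row to the new file.
        with open(new_path, "a", newline="") as f:
            f.write("4,d\n")
        second = source.refresh()

        return len(first), len(second), source.static_loads
```

In this toy version the second refresh picks up the added row while the old file is still read only once, which is the whole point of the request.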
- Comments (2)
RE: Don't reload old data in append queries when refreshing
I had the same issue until I found this solution, which will hopefully help: in the Query Editor, right-click the query for the file that doesn't change and uncheck the option "Include in report refresh". That way only the new one will refresh.
RE: Don't reload old data in append queries when refreshing
Also, when viewing this particular source in the Query Editor, Power BI loads all 600MB of both .csv files in a reasonable amount of time (30 seconds or so), and the mashup evaluation container doesn't show up in Task Manager. So extra work beyond just loading the data must be happening when Refresh is clicked outside the Query Editor.