I wrote a script to sync Google Analytics (GA) pageviews to my own DB on a daily basis. For example:

  • 2019-12-01 -> 342 pageviews
  • 2019-12-02 -> 621 pageviews
  • 2019-12-03 -> 781 pageviews
  • 2019-12-04 -> 388 pageviews
  • 2019-12-05 -> 562 pageviews
  • ...
  • 2019-12-31 -> 597 pageviews

That way, I can generate reports (in my web app, not GA) and filter them by date range.
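A minimal sketch of that date-range filter, assuming the synced rows are available as a mapping from day to pageviews. The `daily_pageviews` dict and the `total_pageviews` helper are hypothetical names for illustration; in the real app this would be a sum query against the DB.

```python
from datetime import date

# Hypothetical in-memory stand-in for the synced table: one row per day,
# using the example values listed above.
daily_pageviews = {
    date(2019, 12, 1): 342,
    date(2019, 12, 2): 621,
    date(2019, 12, 3): 781,
    date(2019, 12, 4): 388,
    date(2019, 12, 5): 562,
}


def total_pageviews(rows, start, end):
    """Sum the stored daily counts for start <= day <= end (inclusive)."""
    return sum(count for day, count in rows.items() if start <= day <= end)


# Report for 2019-12-01 to 2019-12-03: 342 + 621 + 781
print(total_pageviews(daily_pageviews, date(2019, 12, 1), date(2019, 12, 3)))  # 1744
```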

But then I ran into an issue: when I filter the report from 2019-06-01 to 2019-06-30, the total pageviews differ from the GA report.

So I cross-checked day by day, and the numbers from GA and my web app tally.

Google Analytics pageviews compare - 2019-06-01 to 2019-06-21

👆 from 2019-06-01 to 2019-06-21, both tally

Google Analytics pageviews compare - 2019-06-01 to 2019-06-22

👆 from 2019-06-01 to 2019-06-22, the results are different. From this, I assumed the data for 2019-06-22 was the problem

Google Analytics pageviews compare - 2019-06-22

But when I cross-checked 2019-06-22 on its own, both tally again 🤔

Google Analytics pageviews compare - 2019-06-19 to 2019-06-22

👆 from 2019-06-19 to 2019-06-22, the results also tally.
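The manual cross-check above could be automated roughly like this. It is only a sketch: `ga_range_total` is a hypothetical callable standing in for "ask GA for the total pageviews of this whole range in one query", and `local_daily` is assumed to be the per-day counts from the local DB.

```python
from datetime import date, timedelta


def first_diverging_end(local_daily, ga_range_total, start, end):
    """Return the first end date whose range totals disagree, or None.

    local_daily: dict mapping date -> pageviews stored in the local DB.
    ga_range_total: hypothetical callable (start, end) -> total that GA
        reports for that whole range in a single query.
    """
    running = 0
    day = start
    while day <= end:
        # Grow the range one day at a time, as in the screenshots above.
        running += local_daily.get(day, 0)
        if ga_range_total(start, day) != running:
            return day
        day += timedelta(days=1)
    return None
```

Running it with different start dates shows whether the mismatch follows a specific day or the length of the range, which is the pattern seen above.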

After googling for a while, I think this is caused by GA sampling: perhaps for the range 2019-06-01 to 2019-06-22 the data set is too large, so Google computes the report from a sample of the data instead of all of it. See the references 👇
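One way to confirm the sampling theory, assuming the data is fetched through the Analytics Reporting API v4 (the original script may use something else): a sampled report carries `samplesReadCounts` and `samplingSpaceSizes` in its `data` object. The helper names below are hypothetical, and the functions only inspect an already-parsed response dict.

```python
def is_sampled(report):
    """True if a Reporting API v4 report dict carries sampling fields."""
    data = report.get("data", {})
    return bool(data.get("samplesReadCounts")) or bool(data.get("samplingSpaceSizes"))


def sampling_ratio(report):
    """Fraction of sessions GA actually read, or None if not sampled."""
    data = report.get("data", {})
    reads = data.get("samplesReadCounts")
    sizes = data.get("samplingSpaceSizes")
    if not reads or not sizes:
        return None
    # The API returns these int64 values as strings in JSON.
    return int(reads[0]) / int(sizes[0])
```

If a long range reports as sampled while each single day does not, syncing and summing one day per request (as the script already does) is the safer source for the totals.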

References: