Board Thread:Support Requests - Getting Technical/@comment-134861-20151019144320/@comment-1757994-20151020230345

Hmm, the code on the dev wiki has the same problems as the code in the forum article, which isn't surprising, since they have the same original author. It won't necessarily find all the duplicate files if there are a lot of them, because it relies exclusively on gaifrom for continuation, ignoring the possibility of dfcontinue, and because it doesn't set dflimit.

Suggesting only limited edits to that source:

Edit 1

Before:
url = "/api.php?action=query&generator=allimages&prop=duplicatefiles&gailimit=500&format=json";

After:
url = "/api.php?action=query&generator=allimages&prop=duplicatefiles&gailimit=500&dflimit=500&format=json";

Edit 2

Before:
if (gf) { url += "&gaifrom=" + gf; }

After:
if (gf) {
    if (gf.indexOf('|') > -1) {
        url += '&dfcontinue=' + encodeURIComponent(gf);
        gf = gf.split('|')[0];
    }
    url += '&gaifrom=' + encodeURIComponent(gf);
}

Edit 3

Before:
if (data["query-continue"]) findDupImages(encodeURIComponent(data["query-continue"].allimages.gaifrom).replace(/'/g, "%27"));

After:
if (data['query-continue']) {
    if (data['query-continue'].duplicatefiles) {
        findDupImages(data['query-continue'].duplicatefiles.dfcontinue);
    } else {
        findDupImages(data['query-continue'].allimages.gaifrom);
    }
}
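To show how the three edits fit together, here's a rough sketch of just the URL-building and continuation-picking logic, pulled out of the AJAX loop so it's easy to eyeball. buildDupImagesUrl and nextContinue are hypothetical helper names, not part of the original script:

```javascript
// Sketch only: assembles the query URL from the previous response's
// continuation value (either a gaifrom title or a dfcontinue token).
function buildDupImagesUrl(gf) {
    var url = "/api.php?action=query&generator=allimages" +
        "&prop=duplicatefiles&gailimit=500&dflimit=500&format=json"; // Edit 1
    if (gf) {
        // Edit 2: a dfcontinue token contains '|' (title|offset), while a
        // gaifrom value is a plain title. Pass the token through as
        // dfcontinue and reuse its title part as gaifrom.
        if (gf.indexOf('|') > -1) {
            url += '&dfcontinue=' + encodeURIComponent(gf);
            gf = gf.split('|')[0];
        }
        url += '&gaifrom=' + encodeURIComponent(gf);
    }
    return url;
}

// Edit 3: prefer dfcontinue (more duplicates within the current batch)
// over gaifrom (next batch of files); null means we're done.
function nextContinue(data) {
    var qc = data['query-continue'];
    if (!qc) { return null; }
    return qc.duplicatefiles ? qc.duplicatefiles.dfcontinue
                             : qc.allimages.gaifrom;
}
```

The point of splitting it this way is that the same value can be fed back into buildDupImagesUrl regardless of which continuation branch produced it.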

Moving encodeURIComponent into the URL-building step makes it easier to test for '|' first. And I'm unconvinced the apostrophe needs to be percent-encoded at all, given that encodeURIComponent deliberately skips it.
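For the record (the file name here is just a made-up example), encodeURIComponent escapes the pipe but leaves the apostrophe alone, so a dfcontinue token survives the round trip either way:

```javascript
// '|' becomes %7C, but ' is in encodeURIComponent's unescaped set.
var v = encodeURIComponent("O'Brien.png|42");
// v === "O'Brien.png%7C42"
```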

I noticed Bobogoobo also removed the rate limit, so the documentation claiming 500 files every 2 seconds should be edited to match.

HTH