Scraping: what is it and how to use it

Everybody knows about Instagram limits on likes, follows, comments, direct messages, etc.

But in fact, every type of request to Instagram servers may be considered suspicious if made too often. If you've already activated the bot and use filters, then the most frequent request you make is... a profile view. You may go through dozens of profiles that don't fit your conditions before you find one that does. What if you knew exactly which profiles to open and which to skip?

That would significantly increase the time the bot can work before it gets blocked. This is what scraping is for.

The idea of scraping

You create a "fake" account and do the filtering part in this account. Users that fit your filter's conditions are called "targets". They will be stored in the interaction_data.db file in your main account's folder. Then you can just run the bot with your main account and interact with these targets only.

How to use the scraping technique in Insomniac

1. Create a "fake" Instagram account. E.g. if your username is @superhero, register user @superhero.scraper and log in to it. If this account gets blocked, it won't be a big problem for you.

2. Run the bot with the following command-line arguments:
       --scrape-for-account <your real username>, e.g. --scrape-for-account superhero
       --scrape @natgeo amazingtrips (this one works just like --interact)

Don't forget to specify filters, because they are the whole point of scraping.

You may also want to use:
       --scrape-limit-per-source 40-60: the number of profiles to scrape per blogger/hashtag, disabled by default. It can be a number (e.g. 70) or a range (e.g. 50-80)
       --scrape-users-amount 3-8: how many sources to pick from the scraping list (sources are chosen at random). It can be a number (e.g. 4) or a range (e.g. 3-8)
       --total-scrape-limit 150: a limit on the total number of profiles scraped during the session, disabled by default. It can be a number (e.g. 100) or a range (e.g. 90-120)

3. The bot will add all found targets to a special table in the main account's database. This database is a single file named interaction_data.db in the "superhero" folder (provided your username is @superhero as in our example). So, just let the bot "scrape" targets for you.

4. Wait until scraping is finished (or stop it with Ctrl+C) and log in to your main account. Now run the bot with the --interact-targets True command-line argument. Don't set --interact, --scrape or other actions. Insomniac will take usernames from the targets list and interact with them. Each target will be interacted with only once, but it will still be stored in your database.
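To make steps 2 and 4 concrete, here is a minimal Python sketch that assembles both command lines from the arguments described above. The flags are the ones listed in this post; the entry-point script name (start.py) and the helper function names are assumptions for illustration, so replace them with whatever you normally use to launch the bot. Filters are not shown here; set them up the way you usually do.

```python
import shlex

# Assumption: the bot is launched through a script called start.py.
ENTRY_POINT = "start.py"


def scrape_command(main_username, sources, limit_per_source=None,
                   users_amount=None, total_limit=None):
    """Step 2: command line for the "fake" scraper account."""
    cmd = ["python3", ENTRY_POINT,
           "--scrape-for-account", main_username,
           "--scrape", *sources]
    # Each optional limit accepts a number ("70") or a range ("50-80").
    optional = [
        ("--scrape-limit-per-source", limit_per_source),
        ("--scrape-users-amount", users_amount),
        ("--total-scrape-limit", total_limit),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def interact_targets_command():
    """Step 4: command line for the main account (no --interact, no --scrape)."""
    return ["python3", ENTRY_POINT, "--interact-targets", "True"]


scrape = scrape_command("superhero", ["@natgeo", "amazingtrips"],
                        limit_per_source="40-60", users_amount="3-8",
                        total_limit=150)
print(shlex.join(scrape))
print(shlex.join(interact_targets_command()))
```

The printed lines can be pasted into a terminal, or the lists can be passed to subprocess.run if you prefer to launch the bot from a script.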

Using multiple scrapers

You can create multiple "fake" accounts and run scraping from all of them simultaneously. Each "scraping" instance of the bot will update the main account's database file interaction_data.db in the main account's folder.

You may notice that different scrapers try to add the same users as targets. That's OK: targets won't be duplicated in the main account's database. However, you may want to use --scrapping-main-db-directory-name superhero (provided your username is @superhero as in our example). This makes the scrapers log all their actions to a single database (without using their own databases at all), so the same user won't be opened twice by different scrapers.
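Here is a rough sketch of why several scrapers can write to one database without producing duplicate targets. The table and column names are hypothetical and are not Insomniac's actual schema in interaction_data.db; the sketch only illustrates the general technique of a unique key combined with INSERT OR IGNORE, using an in-memory SQLite database.

```python
import sqlite3

# Hypothetical schema for illustration only; the real tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE targets (username TEXT PRIMARY KEY, scraped_by TEXT)"
)


def add_target(conn, username, scraped_by):
    """Store a target; a username that is already present is left untouched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO targets (username, scraped_by) VALUES (?, ?)",
            (username, scraped_by),
        )


# Two scrapers (the second account name is made up) find overlapping users.
add_target(conn, "alice", "superhero.scraper")
add_target(conn, "bob", "superhero.scraper")
add_target(conn, "alice", "superhero.scraper2")

stored = conn.execute(
    "SELECT username, scraped_by FROM targets ORDER BY username"
).fetchall()
print(stored)
```

Because the username is the primary key, the second attempt to add "alice" is silently skipped and the first record is kept.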

Providing custom targets list

If you have a pre-collected list of your targets (from some other tools for example) you can use it too. Create a file "targets.txt" in the folder of your main account and write down usernames in this file. One username per line, no "@"s, no commas. Then run the bot with the same --interact-targets True argument. These targets will be moved from the file to the database, and their copy will be saved as "targets_loaded.txt" in the same directory.
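If your list comes from another tool, it may contain "@" prefixes, commas or repeated names. The sketch below shows one way to clean such a list into the format described above (one username per line, no "@"s, no commas) before saving it as targets.txt. The function names are made up for this example, and the account folder is whatever folder your main account uses.

```python
from pathlib import Path


def normalize_usernames(raw_names):
    """Return unique usernames in original order, without '@', commas or blanks."""
    seen = set()
    cleaned = []
    for raw in raw_names:
        # An entry may itself hold several comma-separated names.
        for part in raw.split(","):
            name = part.strip().lstrip("@")
            if name and name not in seen:
                seen.add(name)
                cleaned.append(name)
    return cleaned


def write_targets_file(account_dir, raw_names):
    """Write targets.txt into the given (existing) account folder."""
    path = Path(account_dir) / "targets.txt"
    lines = normalize_usernames(raw_names)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


print(normalize_usernames(["@alice", "bob, @carol", "", "alice"]))
```

Calling write_targets_file("superhero", names) would then place the cleaned list in the main account's folder, ready for a run with --interact-targets True.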

Thanks for reading, and please note that the scraping technique requires filters and is therefore available only to patrons who activated the bot by joining our $10 Tier.
