Before reading this article, you have probably come across plenty of web scraping software and tools for SEO and marketing. Any seasoned marketer knows the importance of a carefully planned prospecting list, whether it's to distribute a press release about a client's latest innovation, run a social media campaign targeting the top industry players, or reach out to others in your industry for links. (For more on that last point, there's a fantastic article called How to Use Link Juice.) But building a complete, well-thought-out prospect list takes patience and an attentive eye, and usually involves combing through websites that happen to feature your keywords in a single out-of-context article. What if you could save time and eliminate at least part of that work?
If you prospect at all, you probably do it with a manual Google search — either entirely by hand, or with tools such as the MozBar Chrome extension, which lets you save the SERPs (search engine results pages) into a tidy Excel spreadsheet. Whichever approach you choose, prospecting by hand requires a significant amount of time, regardless of the type of project.
However, there's a way to alter that.
Scrapebox, among other applications, is a tool that runs all of your Google searches for you, turning your search terms into a list of potential prospects in just a few minutes.
The first step is to input your keywords. If you're only working with a handful of keywords, just type them directly into the keyword box. If you already have an established list of keywords, you can import it into Scrapebox from a .txt file by clicking the Import button.
Let's suppose you have a list of fifty keywords which, besides being relevant to your prospecting needs, may also yield plenty of irrelevant results, such as e-commerce listings. Scrapebox's merge feature (displayed as an "M" button) lets you import a separate set of keyword modifiers as a .txt file. It joins each modifier to every keyword, saving you the hassle of building a new keyword list in Excel and manually adding every modifier by hand.
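The merge step is just a cross product of keywords and modifiers. Here's a minimal sketch that replicates it offline, in case you'd rather generate the combined list yourself before importing; the keywords, modifiers, and file name are illustrative, not from Scrapebox itself:

```python
# Sketch: replicate the merge ("M") feature as a keyword x modifier
# cross product, one combined query per line, ready to import as .txt.

keywords = ["garden furniture", "patio heaters"]   # your base keywords
modifiers = ['"write for us"', "intitle:blog"]     # footprints / operators

# Join every modifier to every keyword.
merged = [f"{kw} {mod}" for kw in keywords for mod in modifiers]

# Write one query per line, as Scrapebox expects for an import file.
with open("merged_keywords.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(merged))

print(len(merged))  # 2 keywords x 2 modifiers = 4 queries
```

With fifty keywords and even five modifiers, that's 250 queries from two short lists, which is exactly why the merge feature beats building the list in Excel.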
Depending on how many keywords you're searching for, it may be worth purchasing some proxy servers. Because Scrapebox by its nature runs many searches in quick succession, scraping from a single IP is a straightforward way to get your scrape blocked by a captcha. Proxy servers create the illusion that the searches are coming from multiple locations across the globe.
The use of proxy servers is not required, but is highly advised.
If you decide to use proxy servers, the next step is to load them into Scrapebox. To do this, simply click the Manage button in the proxies section, upload the .txt file containing your proxies, enter the associated credentials (usernames, passwords, etc.), and you're done on this front.
Normally, you'd see your complete list of proxies here — but we decided not to share ours with you today.
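Before loading a proxy file, it can save a failed harvest to sanity-check its formatting. The sketch below assumes the common `ip:port` or `ip:port:username:password` line format; your provider's file may differ, so adjust the pattern to match what they actually supply:

```python
# Sketch: validate a proxy .txt file before loading it into Scrapebox.
# Assumes ip:port or ip:port:username:password per line (a common
# convention, not guaranteed for every provider).
import re

PROXY_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}:\d{1,5}(:[^:\s]+:[^:\s]+)?$")

def valid_lines(path):
    """Return only the lines that look like well-formed proxy entries."""
    with open(path, encoding="utf-8") as f:
        return [ln.strip() for ln in f if PROXY_RE.match(ln.strip())]
```

Anything the function drops is worth a manual look — a stray header row or a mangled line in the middle of a proxy list is a common cause of mysterious harvest failures.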
There are also SEO-focused VPS services: virtual private server providers offering hosting designed for SEO professionals who want to run tools like Scrapebox, GSA SER, XRumer, and SEnuke. This is a genuinely useful way to run these kinds of SEO tools, but always keep in mind that you have to stay white hat.
If you have a modestly sized list of keywords and no advanced operators (under 50 keywords usually runs flawlessly without errors), all you have to do is make sure Custom Harvester is selected and set the number of results (100–200 is typically sufficient, since most results beyond that point are lower quality).
When you hit the Start Harvesting option, you'll be asked which search engines you'd like to use. Once you've made your choice, click Start and Scrapebox will begin collecting your URLs. This will likely take about a minute; with more keywords, you may have to wait a few minutes longer.
Managing Your Data
After your scrape finishes running, Scrapebox gives you the option of exporting your complete list of completed (and sometimes not-completed) keywords. You'll then have your list of harvested URLs. If your outreach relies more on a domain itself than on the content or context of a specific page, you can remove duplicate domains in Scrapebox using the Remove / Filter tab on the right side. If, however, your prospecting requires deeper analysis of the specific results, keep multiple URLs per domain to increase your chances of finding relevant results.
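If you'd rather do the domain-level trimming outside the tool, the logic is simple: keep the first URL seen for each domain. A minimal sketch (the helper name and sample URLs are mine, not Scrapebox's):

```python
# Sketch: keep one harvested URL per domain, similar in spirit to the
# Remove / Filter tab's duplicate-domain removal. Requires Python 3.9+
# for str.removeprefix.
from urllib.parse import urlparse

def dedupe_by_domain(urls):
    """Return urls with only the first occurrence of each domain kept."""
    seen, kept = set(), []
    for url in urls:
        domain = urlparse(url).netloc.lower().removeprefix("www.")
        if domain not in seen:
            seen.add(domain)
            kept.append(url)
    return kept
```

Stripping the `www.` prefix before comparing matters here, since a harvest routinely returns both forms of the same site.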
Once you've trimmed everything down and you're ready to export, go to the Export URL List option, export to the format of your choice, save it, and you're done!
If you start running scrapes comprising dozens, hundreds, or many thousands of search phrases, you'll soon find that your scrapes stop delivering results and return error codes instead. This happens when Google detects that your proxies have been running automated search queries. How does it know? As mentioned earlier, Google can identify when an IP performs many searches within a very short span of time, and it then asks for captcha verification to stop this kind of automated searching.
Luckily (or unfortunately, depending on whom you ask), proxies can't solve captchas, which means Google can temporarily block them from performing any further searches. The captcha prompt is also triggered faster when advanced queries are used in rapid succession than when simple ones are. This means a large scrape involving advanced queries needs a delay to prevent the proxies from being overused.
To add one, select Detailed Harvester in the Harvester and Proxies section. When you click Start Harvesting, you'll see an interface similar to the Custom Harvester's, but with an option in the bottom left-hand corner to add a delay. A shorter list of 50 to 100 advanced queries will do fine with a delay of 30 to 60 seconds, while a larger list may require as much as 300 seconds (which means your scrapes may need to run for hours, or even a couple of days, if you plan to run the entire list in one go). The delay feature is also worthwhile if several people in your office share the same proxies and use Scrapebox often, since heavy concurrent use makes it much easier for Google to block your proxies.
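To see why large delayed scrapes stretch into days, the arithmetic is worth making explicit. A rough lower bound (one query per delay interval, ignoring the query time itself):

```python
# Sketch: estimate how long a delayed scrape will take, to decide
# whether to split a big query list across sessions or overnight runs.

def scrape_duration_hours(num_queries, delay_seconds):
    """Lower-bound runtime: one query fired per delay interval."""
    return num_queries * delay_seconds / 3600

# 100 advanced queries at a 45-second delay:
print(f"{scrape_duration_hours(100, 45):.2f} hours")    # 1.25 hours

# 1,000 advanced queries at the 300-second ceiling:
print(f"{scrape_duration_hours(1000, 300):.1f} hours")  # ~83.3 hours
```

At the upper end that's three and a half days of continuous running, which is why splitting the list or running overnight is the practical choice for big jobs.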
I'm Getting a Lot of Bloggers From Elsewhere in the World in My Results
You may notice a significant number of results from websites that aren't valuable to your marketing because of where they operate. This can happen when your proxies are distributed across the globe and consequently sometimes return results from their "home" location. If you're using a large number of proxies, you'll be able to fix this by temporarily deactivating certain proxies within Scrapebox.
To access this feature, simply click the Manage button in the proxies section. You'll see the complete list of your proxies along with details such as their location. Select the proxies from regions that return many irrelevant results, then remove them using the remove-selected-proxies option under the filter tab. Remember to reload your entire proxy list when you prospect again in the future!
Sometimes it's useful to know which keywords yielded which results — for example, during the initial phase of prospecting, when you're still figuring out which keywords and operators work to your advantage . . . and which return unrelated results. You're in luck, because Scrapebox has a way to let you do this.
First, choose "Connections, Timeout and Other Settings" from the Settings drop-down menu at the top.
Next, under the "More Harvester Settings" tab, check the box labeled "Save additional keywords with URLs", then click OK to exit. Proceed with your scrape as usual. When it's complete, you'll be able to view and export your URL list just as you normally would.
To find your keywords, go to your Harvester_Sessions folder (which should be wherever you installed Scrapebox, if you haven't had reason to access it before), where you'll find a .txt file titled "kw_urls_MM-DD-YYYY_HH-MM-SS". Copy and paste its contents into Excel, do some formatting with text-to-columns, and you'll end up with an alphabetized list of URLs and their keywords!
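If you'd rather skip the Excel text-to-columns step, a few lines of scripting do the same job. The sketch below assumes each line of the kw_urls file pairs a keyword and a URL separated by a single delimiter — inspect your own file first and set the separator accordingly, since the exact format isn't guaranteed here:

```python
# Sketch: parse the kw_urls export into (keyword, URL) pairs, sorted
# by URL, as an alternative to Excel text-to-columns.
SEP = "|"  # hypothetical separator - check your actual export file

def load_keyword_urls(path):
    """Return (keyword, url) pairs from the export, alphabetized by URL."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if SEP in line:
                kw, url = line.strip().split(SEP, 1)
                pairs.append((kw.strip(), url.strip()))
    return sorted(pairs, key=lambda p: p[1].lower())
```

From there it's one more step to write the pairs out as a CSV for whatever spreadsheet your team already uses.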
For a handful of tips and tricks on this tool, check out this YouTube playlist.