Creating a Managed Scraper

Updated by Devinder Singh

Introduction

This document explains how to set up a managed scraper. Before you begin, confirm that the website you intend to scrape is supported by our managed scraper service. If it is not, follow the steps in the "Request a site" article.

Steps

  1. Navigate to the Byteline Managed Scraper dashboard.
  2. Within the "Configured Web Scrapers" section, click on the "Add Scraper" button.
  3. Enter the URL of the webpage you want to scrape.
  4. Choose the specific fields you want to extract from the webpage and click the "Next" button.
  5. The next screen shows the test run data, which are sample records from the page. To extract the complete data, use "Use in automation" or "Export entire scrape".

Export scraped data in CSV format

To export the entire scrape as a CSV file, follow these steps:

  1. Click on the "Export entire scrape" link on the managed scraper test run page.
  2. Once the export starts, your scraper is marked with a "Running" status on the dashboard, where you can track its progress.
  3. Once the scraping process is complete, the download button will be enabled. Click on it to retrieve the CSV file containing your scraped data.
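
Once downloaded, the file can be read with any CSV-aware tool. The following is a minimal sketch using Python's standard `csv` module. It assumes the export begins with a header row of field names and is UTF-8 encoded; neither is confirmed by this article. The file name `scrape_export.csv` and the function names are hypothetical.

```python
import csv


def read_records(stream):
    """Parse CSV text from an open file-like object into a list of dicts.

    Assumes the first row holds the field names (not confirmed by this article).
    """
    return list(csv.DictReader(stream))


def load_export(path):
    # newline="" lets the csv module handle line endings inside quoted fields.
    # UTF-8 is an assumed encoding.
    with open(path, newline="", encoding="utf-8") as f:
        return read_records(f)


# Usage (hypothetical file name):
# records = load_export("scrape_export.csv")
# print(len(records), "records")
```

Each record is a dictionary keyed by the header names, so quoted fields containing commas or double quotes come back as plain strings.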

CSV Escape Rules

Our system automatically applies the escaping rules when generating or processing CSV files. If a field contains a comma or a double quote, the field is enclosed in double quotes, and any double quotes within the field are escaped by doubling them. This matches the quoting convention described in RFC 4180, so the generated CSVs can be correctly interpreted by spreadsheet applications and other tools.

Summary of Rules:

  • Fields with commas: Enclosed in double quotes.
  • Fields with double quotes: Double quotes are escaped by doubling them, and the entire field is enclosed in double quotes.
  • Fields with no special characters: No additional formatting is needed.

This ensures that all data is correctly represented in CSV format and can be exported or imported seamlessly across various applications.
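
The rules above can be sketched in a few lines of Python. This only illustrates the rules as summarized here and is not Byteline's implementation; the function names `escape_csv_field` and `to_csv_row` are hypothetical.

```python
def escape_csv_field(value: str) -> str:
    """Apply the quoting rules summarized above to a single field."""
    if "," in value or '"' in value:
        # Double any embedded quotes, then wrap the whole field in quotes.
        return '"' + value.replace('"', '""') + '"'
    # No special characters: the field is written unchanged.
    return value


def to_csv_row(fields):
    """Escape each field and join the results with commas into one CSV line."""
    return ",".join(escape_csv_field(f) for f in fields)
```

For example, `to_csv_row(["New York, USA", "SampleValue"])` quotes only the first field, because only that field contains a comma.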

Examples:

  Input Data          CSV Output
  New York, USA       "New York, USA"
  He said, "Hello"    "He said, ""Hello"""
  SampleValue         SampleValue
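
As a cross-check, Python's standard `csv.writer` with its default settings (minimal quoting, embedded quotes doubled) produces the same output for the example inputs above. This is an illustrative sketch, not a statement about how the export is generated; the helper name `write_single_field` is hypothetical.

```python
import csv
import io


def write_single_field(value: str) -> str:
    """Write one single-field row with csv.writer and return the line."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([value])
    # The default dialect terminates rows with "\r\n"; strip it for display.
    return buffer.getvalue().rstrip("\r\n")


for sample in ["New York, USA", 'He said, "Hello"', "SampleValue"]:
    print(write_single_field(sample))
```

Relying on a CSV library rather than string concatenation is usually the safer choice when producing or consuming files like these, since the library also handles cases the examples do not cover.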

