Extraction Tool: Extract Information from a Web Page

Extract Information from a Web Page

To maintain a website, you need to know what is on each page. Can you answer the following questions for your website?

  • Are there 'dead' links (links that are unreachable on the Web)?
  • Does the page have a title and meta description tag?
  • Is Google Analytics installed on the page?
  • Does the page have inline scripts and CSS?
  • Do images have the 'alt' attribute set?
  • Is descriptive anchor text used?
  • Is compression (gzip) used to serve the page and the resources it links to?
For more details on the information retrieved, visit the extraction tool explanation page.
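
Several of the questions above can be answered from the raw HTML alone. The following is a minimal sketch in Python using only the standard library; it is not the tool's actual implementation. The class name PageAudit and its attribute names are hypothetical, and the Google Analytics check is a simple heuristic that looks for the domains google-analytics.com and googletagmanager.com in script tags.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Hypothetical helper: collects a few on-page checks from raw HTML."""

    # Heuristic markers (an assumption): domains that commonly serve
    # the Google Analytics snippet.
    GA_MARKERS = ("google-analytics.com", "googletagmanager.com")

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None   # None means the tag is absent
        self.images_missing_alt = []   # src of each <img> with no alt attribute
        self.inline_scripts = 0
        self.inline_styles = 0
        self.has_analytics = False
        self._in_title = False
        self._in_inline_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag == "img" and "alt" not in attrs:
            self.images_missing_alt.append(attrs.get("src") or "")
        elif tag == "script":
            src = attrs.get("src")
            if src:
                self._check_analytics(src)       # linked script
            else:
                self.inline_scripts += 1         # inline script
                self._in_inline_script = True
        elif tag == "style":
            self.inline_styles += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "script":
            self._in_inline_script = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_inline_script:
            self._check_analytics(data)

    def _check_analytics(self, text):
        if any(marker in text for marker in self.GA_MARKERS):
            self.has_analytics = True
```

To use the sketch, create a PageAudit, pass the page source to its feed() method, and read the attributes. Dead links and gzip compression cannot be judged from the HTML itself, because they depend on the responses the server sends.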

Using the Extraction Tool

Enter a Web page address (for example, one copied from the address bar of the browser) and click the 'Retrieve Info' button. The tool retrieves the following information:

  • Summary of the Web page
  • HTTP headers sent from the server
  • Whether the page has the Google Analytics code
  • Page title from the <HEAD> section of the HTML file
  • Meta description tag from the <HEAD> section of the HTML file
  • Meta keywords from the <HEAD> section of the HTML file
  • H1, H2, and H3 tags
  • Linked JavaScript files
  • Inline JavaScript
  • Linked stylesheet files
  • Inline stylesheets
  • Anchors (links): href and link text
  • Images: file name and alt text
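
As an illustration of how two of the items in this list (headings, and anchors with their href and link text) might be pulled out of a page, here is a short Python sketch built on the standard library's html.parser module. It is an assumption about one possible approach, not a description of how the tool works; the class name OutlineExtractor is hypothetical.

```python
from html.parser import HTMLParser


class OutlineExtractor(HTMLParser):
    """Hypothetical helper: collects H1-H3 headings and anchors from HTML."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (tag, text)
        self.anchors = []    # list of (href, link text)
        self._open = []      # elements being captured: [tag, href, text parts]

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._open.append([tag, None, []])
        elif tag == "a":
            self._open.append([tag, dict(attrs).get("href") or "", []])

    def handle_data(self, data):
        # Text belongs to every element currently open, so a link inside
        # a heading contributes to both the heading and the anchor text.
        for entry in self._open:
            entry[2].append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            _, href, parts = self._open.pop()
            text = " ".join("".join(parts).split())   # collapse whitespace
            if tag == "a":
                self.anchors.append((href, text))
            else:
                self.headings.append((tag, text))
```

After feeding a page to OutlineExtractor, the anchors list can also be scanned for link text such as "click here", which says little about the target and is the kind of non-descriptive anchor text the tool is meant to help spot.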

The Extraction Tool

Type a website address below or copy and paste the URL (e.g., http://sonatainc.com) from the address bar of the browser. Depending on the number of links on the page, it could take a minute or more to determine the 'dead' links.
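
Checking for 'dead' links takes time because each link generally needs its own request to the server. The Python sketch below shows one way such a check could be written with the standard library; it is an assumption for illustration, not the tool's own code. The function names head_status and find_dead_links are hypothetical, and a link is treated as dead when the server is unreachable or answers with a status of 400 or higher. Some servers reject HEAD requests, so a real checker may need to fall back to GET.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


def head_status(url, timeout=10):
    """Return the HTTP status for url, or None if it cannot be reached."""
    req = Request(url, method="HEAD",
                  headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code            # server answered with an error status
    except (URLError, OSError, ValueError):
        return None                # DNS failure, timeout, malformed URL, ...


def find_dead_links(page_url, hrefs, fetch=head_status):
    """Return (url, status) for each link that looks dead.

    hrefs are the raw href values found on the page; relative ones are
    resolved against page_url. fetch is injectable so the logic can be
    exercised without network access.
    """
    dead = []
    seen = set()
    for href in hrefs:
        url = urljoin(page_url, href).split("#", 1)[0]   # drop fragment
        if urlparse(url).scheme not in ("http", "https") or url in seen:
            continue               # skip mailto:, javascript:, duplicates
        seen.add(url)
        status = fetch(url)
        if status is None or status >= 400:
            dead.append((url, status))
    return dead
```

Because the links are checked one after another with a timeout on each, a page with many links can take a while to process.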

Web page name: