HTML parsing using gem Nokogiri in Rails

parsing html is easy using Nokogiri, to do follow the steps

    Step 1 : Open the terminal and go to your rails application folder.

  • $ cd yourRailsApp
    add  the following line to your Gemfile
    gem  ‘nokogiri’
    First install the dependencies

    $  sudo apt-get install libxml2 libxml2-dev libxslt1-dev
    

    For Fedora do

    $ sudo yum install libxml2-devel libxslt-devel
    

    For CentOS:

    $ sudo yum install -y gcc ruby-devel libxml2 libxml2-devel libxslt libxslt-devel
    

    do bundle install
    /yourRailsApp $ bundle install
    Your nokogiri installation is completed. Parsing HTML documents look like
    doc = Nokogiri::HTML(html_document) and parsing XML documents look like
    doc = Nokogiri::XML(xml_document)
    Nokogiri converts HTML and XML documents into a tree data structure. Nokogiri extracts data using this tree. Xpath and CSS are small languages used for tree traversal. Xpath is a language that was written to traverse an XML tree struture, but we can use it with HTML tree as well.

  • Lets try an example take the folowing html code

    Step 2 : In views/form.erb add

  • Add a text field in your form
    <%= text_field_tag(“my[url]”, @user.url) %> and enter the sample code in this text field.
  • Step 3 : In controller add

  • @user.url = params[“my”][“url”]
    call the function
    @user.get_value_from_url
    write the function in the model
    def get_value_from_url
    node = Nokogiri::XML(self.url)
    node.xpath(‘//param[@name=”pic”]’).each do |node|
    @result = node.attribute(“value”).to_s
    end
    @result
    end

Then you get the value : “http://www.my_site.com/v/xedFamzsaaxo7hc0?fs=1&amp;hl=en_US&#8221; in the result. Look its so easy to use nokogiri. Its Wonderful !!

You can also view Aaron Patterson’s (creator of Nokogiri) blog from here.

Advertisement

Author: Abhilash

I'm Abhilash, a web developer who specializes in Ruby development. With years of experience working with various frameworks like Rails, Angular, Sinatra, Laravel, NodeJS, React and more, I am passionate about building robust and scalable web applications. Since 2010, I have been honing my skills and expertise in the Ruby on Rails platform. This blog is dedicated to sharing my knowledge and experience on topics related to Ruby, Ruby on Rails, and other subjects that I have worked with throughout my career. Join me on this journey to explore the exciting world of web development!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: