Parsing HTML is easy with Nokogiri. To set it up, follow these steps. Start in your Rails application folder:
$ cd yourRailsApp
Add the following line to your Gemfile:
gem 'nokogiri'
First install the dependencies. For Debian or Ubuntu:
$ sudo apt-get install libxml2 libxml2-dev libxslt1-dev
For Fedora:
$ sudo yum install libxml2-devel libxslt-devel
For CentOS:
$ sudo yum install -y gcc ruby-devel libxml2 libxml2-devel libxslt libxslt-devel
Then run bundle install:
/yourRailsApp $ bundle install
Your Nokogiri installation is complete. Parsing an HTML document looks like
doc = Nokogiri::HTML(html_document)
and parsing an XML document looks like
doc = Nokogiri::XML(xml_document)
Nokogiri converts HTML and XML documents into a tree data structure and extracts data by traversing this tree. XPath and CSS selectors are small languages used for tree traversal. XPath was designed to traverse an XML tree structure, but we can use it with an HTML tree as well.
Add a text field to your form:
<%= text_field_tag("my[url]", @user.url) %>
and enter the sample code in this text field.
In the controller, assign the submitted value:
@user.url = params["my"]["url"]
Then call the function:
@user.get_value_from_url
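Write the function in the model:
def get_value_from_url
  # url holds the markup pasted into the text field, so parse it directly
  doc = Nokogiri::XML(url)
  # Collect the value attribute of each <param name="pic"> element;
  # if there are several, the last one wins
  doc.xpath('//param[@name="pic"]').each do |param|
    @result = param.attribute("value").to_s
  end
  @result
end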
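To summarize the steps:
Step 1 : Open the terminal and go to your Rails application folder.
Step 2 : In views/form.erb, add the text field.
Step 3 : In the controller, assign the submitted value and call the function.
Let's try an example. Take some HTML code that contains a param element named "pic" and enter it in the text field.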
Then you get the value "http://www.my_site.com/v/xedFamzsaaxo7hc0?fs=1&hl=en_US" in the result. See how easy it is to use Nokogiri. It's wonderful!
You can also view Aaron Patterson’s (creator of Nokogiri) blog from here.