html - Cannot find the text inside `<span>`


I have the HTML below:

<div class="info">
  <h5>
    <a href="/aaa/">aaa </a>
  </h5>
  <span class="date">
        8:27am, sep 30</span>
</div>

I'm using Ruby and want to get the text "8:27am, sep 30" inside <span class="date">, but I cannot find it with the command below.

find('div.info span.date').text 

Could you please tell me why this doesn't work? If I look for the text inside the h5 with the following command, I get "aaa" correctly.

find('div.info h5').text 

Full Ruby code:

Then(/^you should see (\d+) latest items$/) do |arg1|
  within("div.top-feature-list") do
    # Validate that the images of the items exist, then print a report
    expect(all("img").size.to_s).to eq(arg1)
    puts "the number of items on current site " + (all("img").size.to_s)
    # List of the items' details (image, headline, introduction, identifier, url)
    $i = 1
    while $i <= arg1.to_i
      puts "item no." + $i.to_s
      puts "        - image:        " + find('ul.category-index li.item-' + $i.to_s + ' img')[:src].to_s
      puts "        - headline: " + find('ul.category-index li.item-' + $i.to_s + ' div.info h5').text
      puts "        - introduction: " + find('ul.category-index li.item-' + $i.to_s + ' div.summary').text
      puts "        - url:      " + find('ul.category-index li.item-' + $i.to_s + ' div.info h5 a')[:href].to_s
      puts "        - created date " + find('ul.category-index li.item-' + $i.to_s + ' div.info span.date').text
      puts "        - identifier:   " + find('ul.category-index li.item-' + $i.to_s + ' div.img a.section-name').text
      puts "        - subsection:   " + find('ul.category-index li.item-' + $i.to_s + ' div.img a.section-name')[:href].to_s
      $i += 1
    end
  end
end

More HTML:

<div class="top-feature-list">
  <ul class="category-index">
    <li class="group">
      <ul>
        <li class="item-1 left ">
          <a name="item-1"></a>
          <div class="img">
            <a href="/health-lifestyle/item1.html">
              <img alt="how to" src="//image_url">
            </a>
            <a class="section-name test" href="/health-lifestyle/">
              lifestyle </a>
          </div>
          <div class="info">
            <h5>
              <a href="/health-lifestyle/item1.html">
                how </a>
            </h5>
            <span class="date">
              10:20am, sep 30</span>
          </div>
          <div class="summary">
            <p>
              summary text</p>
          </div>
        </li>
        ....

env.rb:

require 'parallel_tests'
require 'capybara/cucumber'
require 'capybara/poltergeist'
require 'rspec'

Parsing HTML is super easy in Ruby. You need to require two libraries in your program:

require 'open-uri'
require 'nokogiri'

# Set the page you are going to scan.
page = Nokogiri::HTML(URI.open("http://google.com/"))

# (Updated to reflect the date class provided in the question.)
# Extract specific elements via a CSS selector.
# This first selects everything that has a span tag,
# then narrows it down to the class ".date".
# Use .strip to remove the whitespace that comes from the HTML.
page.css('span').css('.date').text.strip
# => outputs "8:27am, sep 30" (for a page containing the HTML from the question)
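A note on trimming the whitespace: in Ruby, `strip` returns a new trimmed string, while `strip!` modifies the string in place and returns `nil` when there was nothing to remove. A minimal sketch of the difference, using plain strings as stand-ins for the text a node might return (the variable names here are illustrative only, not part of any library):

```ruby
# Stand-in for node text that carries the markup's leading whitespace.
raw = "\n        8:27am, sep 30"

# strip returns a new, trimmed string and leaves the receiver alone.
clean = raw.strip

# Stand-in for node text that is already trimmed.
already_clean = "8:27am, sep 30"

# strip! returns nil when no whitespace was removed,
# so chaining it at the end of an expression can yield nil.
bang_result = already_clean.dup.strip!

# strip always returns a string, which makes it safer to chain.
safe_result = already_clean.strip

puts clean.inspect
puts bang_result.inspect
puts safe_result.inspect
```

For that reason the non-bang `strip` is usually the safer choice at the end of a call chain such as `.text.strip`.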

If you want more information on parsing HTML with Ruby, you will need to do some googling and reading. One great resource to get started is here.

