How to scrape a page without knowing its HTML structure

I’m building a module for my site that needs to import a user's blog
into the site. I can read the blog through its RSS feed, but the feed
doesn't give me the full content, so I need to scrape the user's blog
page. How can I scrape a page without knowing its HTML structure? Can
anyone help me with this? Thanks in advance.

You asked this exact question 4 days ago and got 2 answers, which
basically said you can't -- you have to know *something* about the way
the pages are marked up.

It's still true. :)

It seems that looking at the structure would be the easiest way, but
if you wanted something more complex, your scraping program could
infer the layout structure and separate it from the content. Your
program would need to be fed multiple pages from the same blog and
would assume the layout to be the portion that stays mostly the same
from page to page. That's an oversimplification, but that's the
general idea.
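
To make that idea a bit more concrete, here is a rough Python sketch using
only the standard library. It is an illustration under some big assumptions,
not a finished scraper: it treats each run of text between tags as a "block",
assumes all the pages come from the same blog, and calls a block "layout" if
the identical text shows up on most of the pages. The names text_blocks,
extract_content and layout_ratio are made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class TextBlockParser(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and self._skip_depth == 0:
            self.blocks.append(text)


def text_blocks(html):
    """Return the visible text blocks of one HTML page, in document order."""
    parser = TextBlockParser()
    parser.feed(html)
    parser.close()
    return parser.blocks


def extract_content(pages, layout_ratio=0.8):
    """For each page, keep the text blocks that are NOT shared by most pages.

    layout_ratio is a hypothetical tuning knob: a block counts as layout
    if it appears on at least that fraction of the pages.
    """
    per_page = [text_blocks(html) for html in pages]

    # Count on how many pages each block appears (at most once per page).
    seen_in = Counter()
    for blocks in per_page:
        seen_in.update(set(blocks))

    threshold = layout_ratio * len(pages)
    return [[b for b in blocks if seen_in[b] < threshold] for blocks in per_page]
```

You would feed it the raw HTML of several posts from the same blog, e.g.
extract_content([html1, html2, html3]), and get back one list of
"probably content" blocks per page. Exact text matching is the
oversimplified part: dates, comment counts and "next post" links change
from page to page, so a real version would likely need to compare tag
paths or use fuzzy matching instead.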

Good luck.