Full-Text RSS

Extracting articles from non-RSS pages

When extracting articles from a category page on a site that doesn't have RSS, I am using single_page_link: [XPath]. However, it is extracting only the first article. Can I use some sort of loop so it kee...

1 Agent Answer

FeedBurner messing up the dc:identifier

Have you noticed that when using a feed from FeedBurner, there's some Google Analytics code added to the dc:identifier? I don't know about other users, but for us it's a major issue as we use this U...

2 Agent Answers

Need HTML output not XML and remove header

1) HTML output instead of XML output. The encoding of the page affects the process I have running it through the Simple DOM Parser, which produces the special characters. I use the html=1 flag. The source has...

1 Agent Answer

Full text does not capture images from news.yahoo.com

Full text does not capture images from news.yahoo.com

2 Agent Answers

Content returns with special characters

I am using the feed reader to populate a WordPress site, but your reader keeps returning special high-ASCII characters that are causing my inserts to fail. For example the word "I'm" ...

2 Agent Answers

2.7 is the last open release?

Is it still open-source or have you stopped publishing pre-latest code?

1 Agent Answer

Local image caching

Is it possible to add an option for caching large images and converting small ones into data-URI form when downloading, and then replacing links in the feed with those of the cached images? I think it will be great fe...

2 Agent Answers

Release date for 3.0

Hi there, I bought your solution some days ago. Unfortunately, I didn't realise that there are some problems with Google Reader and its update interval. Is there a release date for 3.0, because at the ...

4 Agent Answers

Match div's id suffix

Hello, in a site I am interested in, the id of the div that contains the full article changes from time to time. Right now this works: body: //div[@id='_ctl10__ctl0_Article']. But soon the id will ...

1 Agent Answer

"readability" tag

Hi, I'm having trouble with the "readability" tag and I would like to disable it; it deforms my generated page and breaks it... Is there a way I can disable it? I'm using a self-hosted ...

2 Agent Answers