BBC News - Content only

Hi.

I recently purchased Full-Text RSS with the aim of scraping news articles from an RSS feed, but I only want to capture the title and contents of each article for data mining purposes. How might I go about this? I am unfamiliar with the configuration language and am struggling to understand how to specify these fields for my output. If anyone could help with a sample or just some insight, I would be most appreciative.

Thank you.

Hi Patrick,

Full-Text RSS does not store the results it produces on disk (except for caching purposes). Its purpose is to convert partial RSS feeds (such as the BBC’s) to full text, giving you the full content in the RSS feed or, for developers, in JSON format. If you’re looking to store the content it produces, you will need to pass the feeds Full-Text RSS generates to an RSS reading application that can do this for you: something that monitors an RSS feed you give it and stores the contents so that you can search them or process them later.
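If you would rather script the storing step yourself than use a reader application, here is a minimal sketch in Python of what it could look like. It is an illustration only and not part of Full-Text RSS. It assumes the full-text feed is ordinary RSS 2.0 with the article body in each item's `description` element, and the names `SAMPLE_FEED`, `extract_articles`, `store_articles` and the `articles` table are made up for this example. Fetching the feed from your own Full-Text RSS installation is left out; the sample string stands in for whatever it returns.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in for the XML a full-text feed might return (hypothetical data).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <link>http://example.com/1</link>
      <description>Full text of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/2</link>
      <description>Full text of the second article.</description>
    </item>
  </channel>
</rss>"""


def extract_articles(rss_xml):
    """Return a list of (title, content) tuples, one per <item> in the feed."""
    root = ET.fromstring(rss_xml)
    articles = []
    for item in root.iter("item"):
        # Assumption: the article body is in <description>.
        title = item.findtext("title", default="").strip()
        content = item.findtext("description", default="").strip()
        articles.append((title, content))
    return articles


def store_articles(conn, articles):
    """Save (title, content) pairs, skipping ones that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "title TEXT, content TEXT, UNIQUE(title, content))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO articles (title, content) VALUES (?, ?)",
        articles,
    )
    conn.commit()


if __name__ == "__main__":
    # Use a file path instead of ":memory:" to keep the data between runs.
    conn = sqlite3.connect(":memory:")
    store_articles(conn, extract_articles(SAMPLE_FEED))
    for title, content in conn.execute("SELECT title, content FROM articles"):
        print(title, "|", content)
```

Running something like this on a schedule against the feed would build up a local table of titles and contents for later mining. The `UNIQUE` constraint together with `INSERT OR IGNORE` keeps items that reappear in the feed from being stored twice.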

If you’re curious, here’s how one person used the software, along with Microsoft Outlook, to store news articles for research purposes: https://storify.com/MorganRamsay/how-often-do-video-game-journalists-write-about-fe