Use of cookie jar

Eric — Dec 22, 2017 10:17AM CET

I use Term Extraction, of which the feed reader is a part. I see that there is a library which contains a script for using a cookie jar, but I cannot find any documentation on how to use it. The comments in the script are not enough for me.
In the Netherlands we have very strict cookie rules for websites, so a cookie jar is essential.
Do you have any information on using this feature?


1 Community Answers

Keyvan Minoukadeh - Dec 22, 2017 at 11:51AM CET

FiveFilters.org Agent

Hi Eric,

Cookie Jar is used by our HTTP library (Humble HTTP Agent) to store cookies sent in the HTTP response. These are only stored temporarily, for use in any subsequent requests resulting from the original request. Often a site will set a cookie and issue an HTTP redirect, and the stored cookie must then be sent to the new URL for the request to succeed. That's the purpose of CookieJar.php.
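To make the redirect case concrete, here is a rough sketch in Python. It is not the CookieJar.php code (which is PHP) and does not mirror its API. The names SimpleCookieJar, fake_server and fetch, and the example.nl URLs, are made up for illustration, and the "server" is simulated so the example runs without a network.

```python
from urllib.parse import urlsplit


class SimpleCookieJar:
    """Toy in-memory jar (hypothetical): remembers cookies per host."""

    def __init__(self):
        self._cookies = {}  # host -> {cookie name: cookie value}

    def store(self, url, set_cookie_headers):
        host = urlsplit(url).hostname
        for header in set_cookie_headers:
            # Keep only "name=value"; attributes such as Path are ignored here.
            name, _, value = header.split(";", 1)[0].strip().partition("=")
            self._cookies.setdefault(host, {})[name] = value

    def header_for(self, url):
        host = urlsplit(url).hostname
        pairs = self._cookies.get(host, {})
        return "; ".join(f"{name}={value}" for name, value in pairs.items())


def fake_server(url, cookie_header):
    """Simulated site: sets a cookie and redirects, then expects the cookie back."""
    if url == "http://example.nl/article":
        headers = {
            "Location": "http://example.nl/article?checked=1",
            "Set-Cookie": ["session=abc; Path=/"],
        }
        return 302, headers, ""
    if "session=abc" in cookie_header:
        return 200, {}, "article text"
    return 403, {}, "cookie missing"


def fetch(url, max_redirects=5):
    jar = SimpleCookieJar()  # a fresh jar per original request: storage is temporary
    for _ in range(max_redirects + 1):
        status, headers, body = fake_server(url, jar.header_for(url))
        jar.store(url, headers.get("Set-Cookie", []))
        if status in (301, 302, 303, 307, 308) and "Location" in headers:
            url = headers["Location"]  # follow the redirect, carrying the jar along
            continue
        return status, body
    raise RuntimeError("too many redirects")


print(fetch("http://example.nl/article"))
```

Without the jar, the second request would reach the redirect target with no cookie and the simulated site would refuse it. The jar is created inside fetch and thrown away afterwards, which matches the "stored temporarily" behaviour described above.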

I suspect the cookie situation in the Netherlands you refer to is a slightly different one: visitors having to accept that a site will be using cookies before the actual content is sent. From the examples I've seen, this is often an interstitial webpage with a 'continue' link or an 'accept cookies' link leading to the actual content. In Full-Text RSS, we deal with these pages using site configuration files. So, for example, if we know that a website, let's say example.nl, is presenting these interstitial cookie warning pages, we create a site configuration file called example.nl.txt with something like the following:

#bypass cookie check
single_page_link: //a[contains(@href, '/cookiewall/accept')]

Here we tell Full-Text RSS to look for a link on the page containing '/cookiewall/accept' in the URL and to follow it.
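If it helps to see what that rule selects, the following Python sketch does roughly the same match by hand. It is only an illustration of the idea, not how Full-Text RSS evaluates the XPath internally. LinkFinder, find_bypass_link and the sample page are hypothetical, and a plain substring check on href stands in for the XPath contains() test.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkFinder(HTMLParser):
    """Records the first <a> whose href contains the given substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.match = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.match is None:
            href = dict(attrs).get("href") or ""
            if self.needle in href:
                self.match = href


def find_bypass_link(html, base_url, needle="/cookiewall/accept"):
    finder = LinkFinder(needle)
    finder.feed(html)
    # Resolve a relative href against the page URL so it can be requested next.
    return urljoin(base_url, finder.match) if finder.match else None


# Made-up interstitial page for the hypothetical example.nl site.
page = (
    '<p>We use cookies.</p>'
    '<a href="/privacy">More info</a>'
    '<a href="/cookiewall/accept?url=/nieuws/1">Continue</a>'
)
print(find_bypass_link(page, "http://example.nl/nieuws/1"))
```

The returned URL is the one that would be fetched next in place of the interstitial page. A page without a matching link gives None, in which case there is nothing to follow.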

You'll find real examples used in Full-Text RSS here:

https://github.com/fivefilters/ftr-site-config/blob/5aa171b1b3cff4c5c7a0b3f27803feab2330c1d7/volkskrant.nl.txt

https://github.com/fivefilters/ftr-site-config/blob/336cf8954fc3a7994f3b2673103a6ccad914d2a2/ad.nl.txt

https://github.com/fivefilters/ftr-site-config/blob/5aa171b1b3cff4c5c7a0b3f27803feab2330c1d7/parool.nl.txt

Hope that's some help.
