[FRIAM] scraping a web site
Marcus Daniels
marcus at snoutfarm.com
Wed Jan 4 01:09:51 EST 2017
Once you’ve got all the files (as described below), Microsoft Word can import HTML files. Editors designed for HTML editing (e.g. KompoZer) will often have an “Open from web” option, so you can just type the URL. If you really want systematic scraping, look at libraries like Beautiful Soup (Python-based), but that will involve some programming.
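To give a feel for how much programming "systematic scraping" involves, here is a minimal sketch of a depth-limited crawl that stays inside one site. It is only an illustration under stated assumptions: it uses Python's standard-library html.parser in place of Beautiful Soup so that it runs without installing anything, it handles HTML pages only (not images or other files), and the names LinkCollector, extract_links, fetch, and crawl are made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html, base_url):
    """Return the page's links as absolute URLs with #fragments removed."""
    parser = LinkCollector()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, link)).url for link in parser.links]


def fetch(url):
    """Download one page as text. Assumption: no retries, no politeness delay."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_depth=2, fetch=fetch):
    """Breadth-first crawl of pages whose URL starts with start_url.

    Returns a dict mapping each visited URL to its HTML text.
    max_depth is the number of link "layers" to follow from the start page.
    """
    pages = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            # Covers network and HTTP errors raised by urlopen; skip the page.
            continue
        pages[url] = html
        if depth >= max_depth:
            continue
        for link in extract_links(html, url):
            # Stay on the same site and never queue a page twice.
            if link.startswith(start_url) and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Calling crawl("http://home.earthlink.net/~nickthompson/naturaldesigns/", max_depth=2) would, under these assumptions, return the start page plus everything within two links of it on the same site. Saving each entry to a file, and fetching images, would be the next steps.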
From: Friam [mailto:friam-bounces at redfish.com] On Behalf Of Tom Johnson
Sent: Tuesday, January 03, 2017 10:39 PM
To: Friam at redfish. com <friam at redfish.com>
Subject: Re: [FRIAM] scraping a web site
Nick;
You might try installing Firefox, if you don't already use it, and going here to get the DownThemAll add-on. I recall that you can set how many layers deep you want to go. Of course, if you get ALL your content, you will have to figure out where and how you want to repost it.
https://addons.mozilla.org/en-US/firefox/addon/downthemall/
Tom
On Jan 4, 2017 12:50 PM, "Nick Thompson" <nickthompson at earthlink.net> wrote:
Dear Phellow Phriammers,
I am in the uncomfortable position of being bound by threads of steel to Earthlink. Many, MANY, years ago I started a website on Earthlink, {http://home.earthlink.net/~nickthompson/naturaldesigns/}, and put a lot of my writing, and some commentary, up on it. The website creation and editing medium (Trellix) was pretty good for its time, and there are many ways that I find the site quite satisfying. But gradually Earthlink has withdrawn its support, and now I am not sure I could get in to edit or change it. Meantime, ResearchGate has gotten started, and provides a somewhat better place to meet the world and archive my stuff. And also, having the site on Earthlink binds me to them and their 22-dollar-a-month fee. So. …
I am wondering if there is a way to (or a service that would) scrape the website and, possibly, dump it into a new and more reliable website creation medium? Please, ambulatory knowledge only. I don’t want people doing deep searches to answer this question.
Thanks, as always.
Nick
Nicholas S. Thompson
Emeritus Professor of Psychology and Biology
Clark University
http://home.earthlink.net/~nickthompson/naturaldesigns/
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/ by Dr. Strangelove