Re-Saving web pages to disk
mike2002
07-29-2003, 12:36 PM
In Internet Explorer, when webpages are saved to disk, the majority can (should the user wish to do so) be re-named and re-saved at a later date. But now and then one encounters a page that, unless it is renamed at the time of the original Save, won't let you subsequently re-save it under a new name. It goes through the process of Saving, then displays an error that the page could not be Saved.
Despite examining the contents of the page's Directory folder I cannot see what 'built in' method the page uses to achieve this.
Anyone know?
:confused:
Variable
07-29-2003, 05:54 PM
Could it be a script? Can you view the page as an HTML document? See anything weird in there?
V
mike2002
07-30-2003, 06:46 AM
By "View the page as an HTML document", I take it you mean 'View', then 'Source' on the Toolbar. I don't really know what I'm looking for among all the 'hieroglyphics' (laugh).
If you'd care to try an example by re-saving a page, try this URL:
http://computing.kelkoo.co.uk/b/a/sbs/uk/graphicsCards/name/radeon+9000/type/Graphics+Card/manufacturer/I/111601.html
After the page loads, save it as 'Web Page, complete'. Then go off-line, load the page from your disk, then try saving it again. Whether you re-name it or keep the original name makes no difference but, like me, you'll find that it refuses to be re-saved.
You can only Save some particular pages as plain 'WEB Page, HTML only', as they refuse to do so either by 'Web Page, complete' or by 'Web Archive, single file' (in any case, the latter takes up more disc space for some reason).
There are no doubt those among you who are thinking why don't I just save the page as HTML only. Well, apart from losing any graphics, sometimes (but not always) you lose any 'links' on a page. Instead, they display as 'file' etc. So unless you know the URL of the original page, you have to start searching for it - more time consuming.
Why should I want to re-name pages after saving? As an example a page might be named 'The Windows page'. It would be easier to find if 'The' is removed, then I can look for 'Windows page'. Am I making sense?
Paul Komski
07-30-2003, 04:51 PM
Unless you want to drag their associated "Files Folders" around with any Saved As "Web Page Complete" htm files, you will have a number of problems when you later wish to "manipulate" them - though this can often be achieved by using an authoring package such as Front Page if you are determined - and create your own "Webs" on your HDD.
Another useful approach for archiving saved web pages is to "Save As" a "Web Archive" or mht file. Since all the graphics are embedded within these files in a similar way to the way MIME is used in HTML eMails you can easily move them around, rename them etc, etc. This is IMHO a much underused and useful format.
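To illustrate the MIME comparison above, here is a minimal Python sketch that lists what is embedded in a Web Archive. It assumes the .mht file parses as an ordinary MIME multipart message, which is how the format is generally described; the function name `list_mht_parts` is made up for this example and is not part of IE or any tool mentioned in this thread.

```python
import email


def list_mht_parts(mht_bytes):
    """Return (content_type, content_location, decoded_size) for each embedded part.

    Assumes the archive is a normal MIME multipart message.
    """
    msg = email.message_from_bytes(mht_bytes)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no data of their own
        payload = part.get_payload(decode=True) or b""
        parts.append(
            (part.get_content_type(), part.get("Content-Location"), len(payload))
        )
    return parts
```

Usage would be something like `list_mht_parts(open("page.mht", "rb").read())`, where `page.mht` is a hypothetical file name.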
There are also a number of 3rd party utilities that do a whole range of things with webpages and websites.
Fruss Tray Ted
07-30-2003, 05:12 PM
I don't know why you want to 'Save Web Page' more than once, I usually have the problem of duplicate shortcuts and webpages scattered in different locations adding to my clutter and disorganization! :eek:
It sounds to me like you are trying to save a duplicate page (and accompanying folder) into the same master folder or directory. I have several generations of folders, subfolders and subs of the subs, I'm not sure HOW deep I have them, never did look to see. But I have no problem as you state. I can copy paste the pages into ANY of the folders I choose so long as it is not in the folder it was originally in.
You may be losing the pics because you are not copying the associated folder that is supposed to accompany the webpage into the folder of your choice. As far as the lost links, the only time I've lost those is when the page was no longer valid or kept up on a server. Or just the links were, not sure, and/or it varies from time/page to time/page.
I tried a recently saved webpage and could not duplicate your problem so if I am out of line please excuse me.
_________________________________
Experiment:
I went to the test page you suggested, saved the page (HTML) in a folder I created called 'Test 1' in My Documents. Then went to the webpage again and chose 'Save As' (HTML) again and saved it in My Documents in another folder called 'Test 2'. No prob. Links work, page opens, don't get any 'Do you want to overwrite' queries, I'm scratching my head here... ;)
What OS?
http://www.fleetwoodmac.net/penguin/covers/mystery.jpg
Paul Komski
07-30-2003, 05:20 PM
FTT I think that what Mike was alluding to was that if you had opened your Test1 and then saved IT as Test2 (while offline and rather than as a second save directly from the web) you might duplicate his "problem".
"Rumours" of a Fleetwood Mac comeback, could just be "Rumours" of course! :D
mike2002
07-30-2003, 06:38 PM
Paul: You understand what I mean. You noted that I said "--- Then go off-line, load the page from your disk."
Yes I have used the "Save As" a "Web Archive" option but, as I pointed out, for some reason the subsequent files take up a lot more disk space.
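One possible contributor to the extra size, assuming the archive stores its images base64-encoded as MIME messages usually do: base64 turns every 3 bytes into 4 characters and adds line breaks, so binary content grows by roughly a third. The Python sketch below only demonstrates that arithmetic; the 30,000-byte figure is an arbitrary stand-in for an image, not a measurement of any real page.

```python
import base64
import os


def encoded_size(raw: bytes) -> int:
    """Size of raw bytes once base64-encoded with MIME-style line breaks."""
    return len(base64.encodebytes(raw))


raw = os.urandom(30_000)  # stand-in for an image of about 30 KB
print(len(raw), "bytes raw ->", encoded_size(raw), "bytes encoded")
```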
Fruss Tray Ted: Some simple things take a lot of explaining, things that you could show someone in seconds if they were sitting right beside you.
You stated "don't know why you want to 'Save Web Page' more than once." Reason was given above. Additionally some pages have extremely long original names. But after re-saving any webpage, yes you then Delete the original.
Regarding your 'test'; were you saving the said page separately as both a File and a Directory?
If I save any page simply as just a File, yes it can be renamed as many times as one wishes, as it doesn't have to be re-saved in order to do so. It is the format of both File AND Directory that I am querying. If you re-save a page into the same folder location, Windows will ask if you want it overwritten, but if you change the name you can save the same page as many times as you care to - each with a different name. Am I complicating things un-necessarily? (laugh)
I have just opened up a webpage from 'My Document'. I then clicked on File/SaveAs/Web Page, complete, and re-saved it to the Desktop. Sure enough it was there. But when I try the identical thing with the 'test' example I get the screenshot below, except if I save it as a single file WITHOUT an accompanying Directory.
This brings us back to my basic query - why won't this page re-save when (most) others do?
Paul Komski
07-30-2003, 07:15 PM
You could always try something like WebZip (http://download.com.com/3000-2377-10121937.html?tag=list).
Are the mhts larger than the htmls plus their associated files??
There is an unusual association between "Saved As" pages and their accompanying files; eg deleting just the htm or the folder cannot be done in isolation; deleting one deletes all.
If the htm is copied into an html editor and the associated files imported or copied in separately then all the linkages between the two can be maintained and things named just as one likes.
A program like WebZip parses the various webpages it reads and then rebuilds them such that the pages' "appearance" remains unchanged.
alanr
08-01-2003, 06:09 PM
Mike2002
FYI
__________________________________________________ ___________________
If you'd care to try an example by re-saving a page, try this URL:
http://computing.kelkoo.co.uk/b/a/s...r/I/111601.html
After the page loads, save it as 'Web Page, complete'. Then go off-line, load the page from your disk, then try saving it again. Whether you re-name it or keep the original name makes no difference but, like me, you'll find that it refuses to be re-saved.
__________________________________________________ ____________________
I just saved the above mentioned webpage to my hard disk, set IE to work offline (as I am permanently connected to the Internet), loaded the page from my hard disk and then resaved it under a different file name. I had no problem in doing so!!!
mike2002
08-01-2003, 07:59 PM
alanr: That's strange, because I have encountered quite a few web pages that throw up the above error message whenever I attempt to re-save them.
Additionally I had a certain page that wouldn't back up to a CD-RW as one of the files inside the directory had a name containing more characters than the permissible limit would allow. Another reason for wanting to re-save a page with a shorter file name.
When you save a page as 'Web Page, complete', it gives you a File plus a Directory. The directory name has the additional '_files' after its name. You would assume that you could re-name the File part to whatever, then re-name the directory, adding the '_files' bit. However, that method doesn't work. When you load the page, you will find that the directory doesn't link to the File, and it displays minus any graphics.
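The manual rename described above breaks because the saved HTML still refers to the old '_files' directory by name. A rough Python sketch of doing the rename and fixing those references in one go follows. It assumes the page refers to its directory either literally or percent-encoded (spaces as %20) followed by a forward slash; `rename_saved_page` is a hypothetical helper written for this thread, not an IE or Windows feature, and it is worth trying on a copy first.

```python
import os
import re
from urllib.parse import quote


def rename_saved_page(html_path, new_stem):
    """Rename page.htm and page_files/, rewriting references inside the page."""
    folder, old_name = os.path.split(html_path)
    old_stem, ext = os.path.splitext(old_name)
    if new_stem == old_stem:
        raise ValueError("new name is the same as the old one")
    old_dir, new_dir = old_stem + "_files", new_stem + "_files"

    # latin-1 maps every byte to one character, so the file round-trips unchanged.
    with open(html_path, "r", encoding="latin-1", newline="") as f:
        html = f.read()

    # The directory name may appear literally or percent-encoded; handle both
    # in a single pass so a replacement is never itself replaced again.
    variants = {
        old_dir + "/": new_dir + "/",
        quote(old_dir) + "/": quote(new_dir) + "/",
    }
    pattern = re.compile(
        "|".join(re.escape(k) for k in sorted(variants, key=len, reverse=True))
    )
    html = pattern.sub(lambda m: variants[m.group(0)], html)

    new_html_path = os.path.join(folder, new_stem + ext)
    with open(new_html_path, "w", encoding="latin-1", newline="") as f:
        f.write(html)

    old_dir_path = os.path.join(folder, old_dir)
    if os.path.isdir(old_dir_path):
        os.rename(old_dir_path, os.path.join(folder, new_dir))
    os.remove(html_path)
    return new_html_path
```

For the example given earlier in the thread, `rename_saved_page("The Windows page.htm", "Windows page")` would be the call. The plain text replacement is a simplification: it would also alter any unrelated text in the page that happens to contain the old directory name followed by a slash.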
Have you encountered any pages that won't save at all, except as HTML or Text File only?
alanr
08-02-2003, 07:42 AM
Yesterday when I successfully downloaded and renamed the webpage I was using my work computer Win2000 Pro connected to a LAN that is permanently connected to the Internet. All went well as I posted in an above reply.
This morning I used my home computer using win98SE and connected to the Internet via a dial up modem. I then downloaded the same file and saved it to my hard disk. The webpage file saved and so did a separate directory containing all the graphics. I then disconnected my modem from the Internet, closed and reopened my IE web browser. I then opened the saved webpage and it displayed showing all graphics. I then resaved the webpage using a different name. I then closed my web browser and reopened it and loaded the web page with the new name. All went well, no problems at all.
I can say that I have never experienced the problem you are having and I have saved many a webpage onto my hard disk. I only wish I knew how to advise you...
Regards,
Alanr
mike2002
08-02-2003, 08:22 AM
Yes you are doing exactly as I have done.
Currently the contents of My Documents is 23,034 files and 1,746 folders - a grand total of 253Mb. I keep attempting to do a drastic 'prune'!!
I don't have this peculiarity with the majority of files saved, and it led me to believe that it lay with the pages themselves. Maybe my installation has become corrupted in some way. I'll ask my son to perform this function and see what result he has.
Thanks for your interest.
Paul Komski
08-02-2003, 10:05 AM
Using WinXPsp1 and IE6 I get the same results as Mike2002 with http://computing.kelkoo.co.uk/b/a/sbs/uk/graphicsCards/name/radeon+9000/type/Graphics+Card/manufacturer/I/111601.html
"Re-Saving As" fails and produces the "cannot save to location" message. In my instance it does this whether or not I remain on line.
If the source of the webpage, after it has been saved to hdd, is viewed/edited with either Notepad or FrontPage and then immediately SavedAs a new htm file (in the same directory as the original associated files folder) then there is no problem!
Well there is a problem in that you have to retain the original name of the files folder, or else the linkage to all the images in it is lost. The original htm file must be renamed before deleting it or else it will also delete the associated files folder.
This makes it seem to be an IE6 browser related issue. Which version of it are you using alanr? - mine is the raw IE6 from a clean install of XP to which SP1 alone had been added.
There's quite a lot of html on that page, including a lot of javascript - so I can't exclude an issue in there - but I can't be bothered to wade through it all.
Importing the page and then the associated folder into FrontPage works nicely because either the page or the folder can be moved around or renamed and all the hyperlinks and associations are automatically updated.
mike2002
08-02-2003, 10:38 AM
Paul. To quote you:
"Re-Saving As" fails and produces the "cannot save to location" message. In my instance it does this whether or not I remain on line.
Exactly what happens to me. In fact I did another test by downloading the page from the web and saving it to my Desktop. Then, while still on-line, I opened it from the Desktop and re-saved it with the same name. It didn't work. Did you notice how it goes through the motions of saving, then fails at the last instant? I grabbed a screenshot and managed to catch it just before the fail message appeared.
alanr
08-02-2003, 12:12 PM
I am using IE 5.5
It reminds me, I am taking an online course and had problems downloading the course material and saving it to my hard disk. At the time I was using IE 6.0 I believe. They told me that many of their students had problems using IE 6.0 and recommended that I go back to IE 5.0 or 5.5. So I agree with Paul, it is most likely a bug with IE 6.0.
pave_spectre
08-03-2003, 08:52 AM
Just curious: Have you tried opening the pages in notepad as source code and then trying to save as a different named file from there?
Paul Komski
08-03-2003, 10:50 AM
It worked for me:
If the source of the webpage, after it has been saved to hdd, is viewed/edited with either Notepad or FrontPage and then immediately SavedAs a new htm file (in the same directory as the original associated files folder) then there is no problem!
... but you are left with the problem of the folder containing the pics etc, which is "tied" to its original "parent" htm file.
alanr
08-03-2003, 06:02 PM
When I tried to open the file using notepad it said file was too large and it then opened in wordpad. I then saved the file as a text file. I could not figure out how to save it as an html file using wordpad. I then opened a DOS window and renamed the file with an "htm" extension. Then I opened the file using IE 5.5 and the webpage opened correctly.
Question: How do you save a file as an HTML file using Wordpad? Is there a way? The "save as" drop down menu does not give this option.
Paul Komski
08-03-2003, 06:21 PM
In the Save As Type box first choose one of the text options.
Then in the file name just give it the .htm extension by replacing .txt with .htm eg:- yourfile.htm
Paul Komski
12-01-2003, 03:54 PM
I was messing with some .mht files earlier today and this experimenting led to what I believe is the answer to this thread.
When a page is saved as a web page (complete), the html file and the folder containing the relevant files are both created side by side in the designated location.
The way that such html files link to any css files within the associated folder lies at the heart of the problem as far as I can tell.
Open the saved page in IE and then View Source. Find any link tags in the head of the source that point to such a css file, eg:- <link rel="stylesheet" type="text/css" href="foldername/cssdivs.css">, then cut them out and paste them into Notepad for the time being. Save the changes and Refresh IE.
Now you should be able to save as WebPage (complete) and change the name as you wish - though you will have lost the css formatting for the time being. To get the formatting back paste back the link tag(s) you cut out earlier - edited appropriately if needs be.
As far as I can tell it is only css files in the folder that cause this problem. Images and js don't seem to affect things. Perhaps files other than css will cause this behaviour too. What led me to this discovery was mht files, which I discovered can embed the pictures and so on but can't embed the css! It was a short step to see that this was related to the behaviour of saving complete web pages - and then I remembered and found this thread.
;)
PS
The relevant tag in the Kelkoo link above (after it has been saved to disk) is <LINK
href="Radeon 9000 - Buy at cheap prices from the best shops at Kelkoo - Graphics Cards_files/kelkoov51.css"
type=text/css rel=stylesheet>
It appears twice - so must be cut out twice. Then the page can be re-saved with any name you like.
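The cut-and-paste step above could also be scripted. The Python sketch below is only an illustration of that idea: it removes every link tag that is marked rel=stylesheet and points at a .css file, and hands the removed tags back so they can be pasted in again later. `split_stylesheet_links` is a name invented for this example, and the regular expression is a simplification that assumes no '>' character appears inside the tag's attributes.

```python
import re

# Matches one whole <link ...> tag, even when it is spread over several lines.
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)


def split_stylesheet_links(html):
    """Return (html_without_css_links, removed_tags)."""
    removed = []

    def drop(match):
        tag = match.group(0)
        is_stylesheet = re.search(r"rel\s*=\s*[\"']?stylesheet", tag, re.IGNORECASE)
        points_to_css = re.search(r"\.css\b", tag, re.IGNORECASE)
        if is_stylesheet and points_to_css:
            removed.append(tag)  # keep it so it can be restored afterwards
            return ""
        return tag  # leave every other link tag alone

    return LINK_TAG.sub(drop, html), removed
```

Writing the first returned value back over the saved .htm file corresponds to the "cut out" step. The tags in the second value are what would be pasted back, edited to suit the new folder name, once the page has been re-saved.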
mike2002
12-02-2003, 12:30 PM
Paul: Well done, though it involved a lot of work in order to find a solution.
One can also, in this instance anyway, delete the kelkoov5 & kelkoov51 .css files entirely without detriment. It still leaves you with the graphics, the links work, what more could you ask for?