Talk:Make a backup file

== Assumes unix ==

This task assumes unix, for example: "keep in mind symlinks". (So presumably it's legal to use os-specific code that would fail on windows?) --[[User:Rdm|Rdm]] 19:43, 9 November 2011 (UTC)

Then again, it specifies "avoid external commands" so presumably this means that libc should not be used from non-C languages? I'm not sure which requirements take precedence over which other requirements here. --[[User:Rdm|Rdm]] 19:46, 9 November 2011 (UTC)
:It looks like later versions of Windows [[wp:Symbolic_link#Windows_7_.26_Vista_symbolic_link|do have symbolic links]], but I never used symbolic links very much on any platform, so I don't know how important the operational differences are. --[[User:Mwn3d|Mwn3d]] 19:50, 9 November 2011 (UTC)
::Interesting, and they seem to work. And the "It is assumed" part of this task presumably means that if the program needs administrative privileges, it can be assumed to have them (I do not know what privileges are needed by default to rename or delete these links but administrator is needed by default to create them). --21:53, 9 November 2011 (UTC)
::: yes, i don't want to introduce another set of complications. and i certainly hope that links in windows can be manipulated without admin privileges though. but actually, the link itself does not need to be manipulated, only the target of the link.--[[User:EMBee|eMBee]] 03:11, 10 November 2011 (UTC)
:the task is written towards unix because that's all i know. :-) if you can help make it more generic, then i'd appreciate that. one of the points is to work out quirks when you deal with renaming files that actually point to a different location.
: libc may be used if it is linked to your language runtime or into the executable. what should be avoided is things like exec or popen which fork an external command, and even more so try not to use things like system which execute a shell to run the string you pass. the last one might make the task be a mere wrapper around any [[UNIX Shell]] entry.
: the reason for avoiding exec is that in some environments executing other commands has its own set of problems. path issues, security, or even simply availability, and also portability. (a solution that executes mv is going to be harder to port to windows than one that uses some libc rename function)--[[User:EMBee|eMBee]] 03:11, 10 November 2011 (UTC)
:: You did not address dynamic loading at all (dlopen, ...). Also, you did not give enough detail for me to decide what to do about the case where file rename is implemented in utility which comes with the language and which uses mv on unix and MoveFileW from kernel32 on windows. --[[User:Rdm|Rdm]] 14:31, 10 November 2011 (UTC)
::: i think dlopen is ok, i am not sure though, one question i am interested in is: can i deploy a program without relying on dependencies that i can not control? what if a user wants to run the program in a chroot or jail environment where mv might be missing. dlopen would be ok here only if the library to be opened somehow comes with the language, like it is a standard dependency of the language (as opposed to a dependency of just this task). this partly answers the case where the language implements rename using mv because in such a case mv would also be a standard dependency of the language. although i still would like to prefer a version that doesn't rely on external tools and processes.
::: in any case, if you can not avoid running an external process or if you dlopen a 3rd party library then please point this out in the description.--[[User:EMBee|eMBee]] 17:07, 10 November 2011 (UTC)
:::: Hmm... this warrants some thought: libc is the portable (documented) interface to the unix kernel. It can hypothetically be a static library but that is extremely rare nowadays -- almost everything requires an external libc. That said, there's also the question of "which libc", and the one used at build time is probably the right answer to that question (there will be a hard coded path in the executable which references libc for almost every working program out there in unix land). Or, that's how I would like to characterize the problem. And I think this thinking routinely applies in most all chroots. --[[User:Rdm|Rdm]] 17:14, 10 November 2011 (UTC)
::::: sure, libc is dynamically linked in most cases, but only in a few cases would you access it manually with dlopen.--[[User:EMBee|eMBee]] 17:40, 10 November 2011 (UTC)
:::::: Ok, but one of those cases would be an interpreter which was designed to be portable across a variety of platforms. Here, you might have a core that gets you running and then everything else is done in the interpreter. That said, I can see an argument for providing special case support for libc on unix platforms. --[[User:Rdm|Rdm]] 17:53, 10 November 2011 (UTC)
::::::: as i said above, if the core language implements rename/move using external commands then those commands become a direct dependency of the language and are ok to use. presumably such a language will rely on external commands for other things as well and thus it doesn't make much sense to avoid one and leave the others. in a situation where external commands are not allowed by policy, such a language would not be usable anyways. the limitations should only apply to languages where a portable method is not already available and different options could be chosen. in that case the choice should be made according to the restrictions given.--[[User:EMBee|eMBee]] 03:11, 11 November 2011 (UTC)
:::::::: Note that "rename is atomic" [http://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows assumes unix] (or maybe a recent version of windows and an appropriate file system). --[[User:Rdm|Rdm]] 14:14, 14 November 2011 (UTC)
::::::::: true, but it is only stated as an advantage not a requirement for this task. even without being atomic rename is cheaper and thus less likely to fail...--[[User:EMBee|eMBee]] 14:42, 14 November 2011 (UTC)
:::::::::: Note that this still assumes unix -- here's some examples illustrating this point: http://stackoverflow.com/questions/7147577/programmatically-rename-open-file-on-windows and http://stackoverflow.com/questions/1261269/how-to-open-file-in-windows-while-not-blocking-its-renaming --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 21:33, 17 May 2013 (UTC)
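
To make the "library call versus external command" distinction in this thread concrete, here is a minimal Python sketch. It is only an illustration, not a reference solution for the task: the function name <code>rename_without_subprocess</code> is made up for this note. It relies on <code>os.replace</code>, which asks the language runtime to do the rename in-process, so no <code>mv</code> binary, shell, or fork is involved.

```python
import os


def rename_without_subprocess(src: str, dst: str) -> None:
    """Rename src to dst through the runtime's own rename call.

    Hypothetical helper for illustration only. No external process
    (mv, a shell, ...) is started, so it also works where those
    tools are missing, e.g. in a minimal chroot.
    """
    # os.replace overwrites an existing dst on both POSIX and Windows.
    # Plain os.rename would raise FileExistsError on Windows if dst exists.
    os.replace(src, dst)
```

Whether the rename is atomic still depends on the platform and file system, as noted above; the sketch makes no claim about that.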

== why external commands are bad ==

the motivation to avoid external commands can be illustrated by an experience i had just recently: on a website a framework uses cvs to manage changes to the contents. yesterday i wanted to add something to that site, and i was presented with this error: ''fork() failed with ENOMEM. Out of memory?''. draw your own conclusions...--[[User:EMBee|eMBee]] 03:18, 11 November 2011 (UTC)

== Why no copying? ==

Backup involves copying, and must do since otherwise it is the same file and will be modified by the subsequent update. (Or alternatively it has to have some very special support from the OS; there's no POSIX operation for “checkpoint this file to this other name without copying” IIRC.) The whole strength of backups comes from copying. –[[User:Dkf|Donal Fellows]] 15:42, 11 November 2011 (UTC)
: This depends on the OS and on the pattern of accesses applications use on the file. Under unix, if anything has the file open for writing, then renaming it means they will update the backup.
:: this is of course a concern, but only if multiple processes deal with a file which is not the concern of this task. also if a copy of a file is made while another process writes to it then the problems are not any less.--[[User:EMBee|eMBee]] 16:06, 11 November 2011 (UTC)
: But if everything uses the "rename and write new copy" system, then it can be safe (though, of course, there's also the issue of more recent backups overwriting older backups). --[[User:Rdm|Rdm]] 15:48, 11 November 2011 (UTC)
:It seems faster to just rename the file. With copying it goes like this: create the new file (.backup), copy the contents of the old file to the new file, clear the old file, write new data to the old file. Without it goes like this: rename the old file to a new name (.backup), create the old file again (already empty), write new data to the newly created file (with the old name). --[[User:Mwn3d|Mwn3d]] 15:52, 11 November 2011 (UTC)
:good question. thanks for asking. copying is more expensive than rename. copying can fail (due to lack of space for example). if the machine dies before the copied file is written to the disk, which may be some time after the OS signaled that the copy is complete, and you already started to write to the old file, then both may be lost. rename guarantees that the data is not touched, and thus can hardly be corrupted. and i don't think a rename could cause a file to be deleted if the machine crashes while a rename happens. it's either got the old name or the new one (or in very obscure situations maybe both). as far as i can tell, rename() is posix. at least the rename(2) manpage makes that claim. it is atomic too...--[[User:EMBee|eMBee]] 16:06, 11 November 2011 (UTC)
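
The "rename, then write a new file" sequence described in this section can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the name <code>save_with_backup</code> and the fixed <code>.backup</code> suffix are choices made for this note, and an older backup with the same name is silently overwritten, which is the caveat raised above.

```python
import os


def save_with_backup(path: str, new_text: str) -> None:
    """Move the old file aside, then write new_text under the old name.

    Hypothetical helper for illustration only.
    """
    backup = path + ".backup"
    # Step 1: rename. The old data is not read or copied, only renamed.
    os.replace(path, backup)
    # Step 2: create the file again under the old name and write the new data.
    with open(path, "w", encoding="utf-8") as f:
        f.write(new_text)
```

After <code>save_with_backup("notes.txt", "new contents")</code>, the previous contents would be found in <code>notes.txt.backup</code>.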

== No existing file ==

"Some examples on this page assume that the original file already exists. They might fail if some user is trying to create a new file." So, is it a task requirement that solutions should simply create a new file if there is no existing file? That is not something I would read into "In this task you should create a backup file from an existing file..." If this case is desired it should be added as one of the bullet points. —[[User:Sonia|Sonia]] 23:11, 16 February 2012 (UTC)
:if the file does not exist, it should not be created. it would be nice if the code would fail gracefully if the file is missing, but i don't think this is necessary. it is just a code snippet to solve a particular problem. i'd expect developers to adapt the code if their situation is slightly different. [[Ensure that a file exists]] solves that part for example. no need to repeat it here.--[[User:EMBee|eMBee]] 03:07, 17 February 2012 (UTC)
::Oh good. I'll change the solutions I just posted. —[[User:Sonia|Sonia]] 03:13, 17 February 2012 (UTC)
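
One possible reading of "fail gracefully, but do not create the file" is sketched below in Python. The helper name <code>backup_if_present</code>, the <code>.backup</code> suffix, and the boolean return value are assumptions made for this note; the task itself does not prescribe them.

```python
import os


def backup_if_present(path: str) -> bool:
    """Rename path to path + '.backup'; report False if path is missing.

    Hypothetical helper for illustration only.
    """
    try:
        os.replace(path, path + ".backup")
    except FileNotFoundError:
        # Nothing to back up, and nothing is created in its place.
        return False
    return True
```

Attempting the rename and catching the error avoids a separate existence check, so there is no window between "check" and "rename" in which the file could disappear.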

== Follow symlinks ==

FWIW, following symlinks seems like a really bad idea. It's fine in the context of something like Emacs (which sounds like a possible motivation) with a feature to visit files under their "real" names, but in those cases the user is usually aware of the new name via the UI (as in getting a different buffer name). But for a script this is just wrong, since you get a script which works in a way that can change in the presence of symlinks -- and the whole point of symlinks is to get things to work even when a file is elsewhere. I think that it would be better to simplify this by ignoring symlinks completely, and introduce a separate task for resolving symlinks. --[[User:Elibarzilay|Elibarzilay]] ([[User talk:Elibarzilay|talk]]) 21:01, 17 May 2013 (UTC)

::Indeed. A simple application reading/writing files should not normally care (or check) if they are reading/writing via symlinks. The Go code for example is broken since it blindly assumes that any symlink doesn't point to another symlink. There are far too many ways to screw it up unless you really know what you're doing and you really understand symlinks (and how any specific user might choose to use them and want them to behave). IMO it shouldn't be an application's job to make file backups at all (except perhaps as an optional "feature" of an editor or some such; and for example editors like vim have a lot of options related to this so it will do what a user wants; assuming you can blindly look up where a symlink points to and mess around in that directory is just bad). —[[User:dchapes|dchapes]] ([[User talk:dchapes|talk]] | [[Special:Contributions/dchapes|contribs]]) 14:00, 6 September 2014 (UTC)
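
For solutions that do follow links as the task asks, the "symlink pointing to another symlink" pitfall mentioned above can be avoided by resolving the whole chain rather than a single hop. The Python sketch below is only an illustration: <code>backup_target</code> and the <code>.backup</code> suffix are assumptions made for this note, and it does not settle whether following links is a good idea in the first place.

```python
import os


def backup_target(path: str) -> str:
    """Back up the file that path finally refers to; return the backup name.

    Hypothetical helper for illustration only.
    """
    # realpath follows every link in a chain, not just the first one,
    # so link -> link -> file ends up at the real file.
    target = os.path.realpath(path)
    backup = target + ".backup"
    # The links themselves are left untouched; only the target is renamed.
    os.replace(target, backup)
    return backup
```

Note that the backup lands in the directory of the real file, not next to the link, which is exactly the behaviour questioned in this section.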