Android NDK: fseek and lseek
There is a whole family of file seek functions: fseek and fseeko operate on a stream, while lseek, lseek64 and others operate on a file descriptor. Here are some of the prototypes:
int fseek(FILE *stream, long offset, int whence);
int fseeko(FILE *stream, off_t offset, int whence);
off_t lseek(int fd, off_t offset, int whence);
off64_t lseek64(int fd, off64_t offset, int whence);
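For files larger than 2 GB, seeking directly to a position past the 2 GB mark needs an offset greater than 2^31 - 1, so the offset type must be 64 bits wide.
Judging by the offset parameter above, there are three types: long, off_t, and off64_t.
off_t and off64_t are typedefs. off64_t is 64 bits by definition, so lseek64 always supports large files.
Regarding long and off_t, the fseeko man page has this to say: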
#define _FILE_OFFSET_BITS 64
will turn off_t into a 64-bit type.
So for off_t, a single macro (defined before any header is included, or passed as a compiler flag) is enough to make it 64 bits.
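As a minimal sketch of that pattern on a glibc-based Linux system (not on the Android NDK discussed further down), the macro goes before the first #include, after which fseeko and ftello take and return a 64-bit off_t. The helper name seek_and_tell is made up for this illustration.

```c
#define _FILE_OFFSET_BITS 64     /* must come before any system header */
#define _POSIX_C_SOURCE 200809L  /* exposes fseeko/ftello in strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: move the stream to an absolute offset and report
 * where it ended up. Returns (off_t)-1 if either call fails. */
static off_t seek_and_tell(FILE *stream, off_t offset)
{
    assert(stream != NULL);
    if (fseeko(stream, offset, SEEK_SET) != 0)
        return (off_t)-1;
    return ftello(stream);
}
```

With a 64-bit off_t, a call such as seek_and_tell(fp, (off_t)3 << 30) can address a position 3 GiB into the file. Seeking past the current end of a regular file is allowed and does not by itself grow the file.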
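As for long, its width depends on the platform's data model. On 32-bit platforms long is 32 bits. On 64-bit platforms it is 64 bits under LP64 (Linux, Android) and 32 bits under LLP64 (64-bit Windows).
To summarize: fseek cannot support large files on 32-bit platforms and may support them on 64-bit platforms, depending on the data model; fseeko and lseek can be made to support large files with the macro; and lseek64, as its name suggests, always supports large files.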
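To check which case applies on a given target, the widths can simply be measured. The sketch below is only an illustration; the function names are invented, and the _GNU_SOURCE define is there because glibc hides off64_t without it.

```c
#define _GNU_SOURCE 1   /* on glibc, exposes off64_t */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* off64_t is expected to be 64 bits on every target; fail the build if not. */
static_assert(sizeof(off64_t) * CHAR_BIT == 64, "off64_t must be 64 bits");

/* Width in bits of each candidate offset type on the current target. */
static int long_bits(void)  { return (int)(sizeof(long) * CHAR_BIT); }
static int off_t_bits(void) { return (int)(sizeof(off_t) * CHAR_BIT); }
static int off64_bits(void) { return (int)(sizeof(off64_t) * CHAR_BIT); }

/* Prints one line per type. */
static void print_offset_widths(void)
{
    printf("long:    %d bits\n", long_bits());
    printf("off_t:   %d bits\n", off_t_bits());
    printf("off64_t: %d bits\n", off64_bits());
}
```

Building the same file with and without -D_FILE_OFFSET_BITS=64 and comparing the off_t line shows whether the toolchain honors the macro.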
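Everything above applies to Linux. Now for Android.
On Android, fseek cannot handle large files. As for fseeko and lseek, defining _FILE_OFFSET_BITS made no difference either. A search turned up the reason: the NDK did not support it, as of the 2014 issue quoted below. https://code.google.com/p/android/issues/detail?id=64613
Since that page can be hard to open, its content is reproduced here: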
This is arguably a dupe of Issue #55866 (NDK: Missing large file support), but that bug is still in NotEnoughInformation, so lets provide more information...
The NDK currently declares e.g.
extern off_t lseek(int, off_t, int);
extern off64_t lseek64(int, off64_t, int);
While this provides "large file support", it does not go as far as glibc does. On "proper" Linux, it's more complicated; if _FILE_OFFSET_BITS is set to 64, then the "normal" file I/O functions are 64-bit -- lseek(2) would take a 64-bit off_t, not a 32-bit off_t.
http://users.suse.com/~aj/linux_lfs.html
> In a nutshell for using LFS you can choose either of the following:
> * Compile your programs with "gcc -D_FILE_OFFSET_BITS=64". This forces all
> file access calls to use the 64 bit variants.
(See also e.g. glibc <features.h> and <unistd.h> which has lots of fun/complicated #if-fu.)
Many open-source libraries will use autoconf's AC_SYS_LARGEFILE macro (or variants thereof) in order to check for and enable 64-bit off_t on 32-bit platforms, effectively resulting in:
#define lseek lseek64
http://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/System-Services.html
The problem with the NDK is that it doesn't support any of these patterns. Consequently, software built to support _FILE_OFFSET_BITS=64 behavior won't be built with this support enabled (because the NDK doesn't support it), resulting in use of the 32-bit file APIs instead of the 64-bit APIs.
https://code.google.com/p/android/issues/detail?id=55866#c6
> I have the same problem with my ffmpeg build. Can't open video files larger than 2GB.
What would be useful is for the NDK to implement/conform to the current glibc macros/patterns so that software with large file support can easily make use of it on Android.
Jan 9, 2014 Project Member #1 e...@google.com
we don't actually have the full set of *64 functions yet either, but we're working on it.
Summary: implement _FILE_OFFSET_BITS (was: NDK: Missing "GNU compatible" large file support.)
Owner: e...@google.com
Cc: e...@google.com cfer...@google.com
Nov 6, 2014 #2 wolfgang...@gmail.com
This issue hit me recently in migrating some code that was safe using autoconf (AC_SYS_LARGEFILE) and I tried (paranoid) adding in `#define _FILE_OFFSET_BITS 64` all over the place.
Finally realized with a very small test program that Android does not respect `#define _FILE_OFFSET_BITS 64` or the autoconf macro as expected.
This led to a maddening bug that was hard to track down as core expectations were not correct.
Is this non-conformance documented anywhere?
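So that leaves only lseek64, which Android fortunately does support.
But how to use it? The existing code was written entirely with fopen, fread, fseek, ftell and friends. Fortunately there is fileno.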
int fileno(FILE *stream);
This function returns the file descriptor underlying a stream.
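As a small illustration of fileno combined with lseek64, this sketch computes a file's size through the descriptor behind a stream. file_size_64 is a hypothetical helper, not part of any library; it puts the descriptor offset back where it found it, so the position the stream expects is unchanged.

```c
#define _GNU_SOURCE 1   /* on glibc, exposes lseek64 and off64_t */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: size in bytes of the file behind `stream`,
 * or -1 on error. The descriptor offset is restored before returning. */
static int64_t file_size_64(FILE *stream)
{
    assert(stream != NULL);
    int fd = fileno(stream);
    if (fd == -1)
        return -1;
    off64_t saved = lseek64(fd, 0, SEEK_CUR);   /* remember current offset */
    if (saved == (off64_t)-1)
        return -1;
    off64_t end = lseek64(fd, 0, SEEK_END);     /* offset of end == size */
    if (lseek64(fd, saved, SEEK_SET) == (off64_t)-1)
        return -1;
    return (int64_t)end;
}
```

Data still sitting in the stream's write buffer is not counted, so a writer would call fflush before asking for the size.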
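Below is our own 64-bit-capable fseek wrapper. Note the difference in return values: fseek returns 0 on success, while lseek64 returns the resulting offset, or -1 on failure.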
int fseek_64(FILE *stream, int64_t offset, int origin)
{
    int fd = fileno(stream);
    if (lseek64(fd, offset, origin) == -1)
        return errno;
    return 0;
}
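And that's it, it works.
After the program had run for a while, though, something was off, and tracing it back pinned the bug on our own fseek_64.
The symptom: read some data with fread, call fseek_64, then fread again, and the data read is not what was expected. The wanted data was always preceded by some stale bytes. So the guess was that fread buffers, and the stale bytes were unread data left in that buffer. Why does fseek not have this problem while our lseek64-based fseek_64 does? How are fseek and lseek related, and how do they differ? A search confirmed the guess.
First, the fread/fwrite family really is implemented with a buffer. lseek is a system call, whereas fseek belongs to the standard C library; underneath it also calls lseek, but it takes care of the buffer as well. For example, suppose the buffer holds 10 unread bytes and we want to skip forward 4 bytes: fseek does not need to call lseek at all, it can just advance the buffer pointer by 4. If we want to skip forward 40 bytes instead, fseek calls lseek to jump to the target position and then discards the buffer.
Our fseek_64 only called lseek64 to jump to the target position and never touched the buffer, which caused the bug above.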
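The effect described above can be reproduced with a small experiment. The sketch below is illustrative only: byte_after_seek and the DEMO_* constants are made-up names, it uses tmpfile for scratch storage, and what the lseek64 path returns depends on how much the C library read ahead.

```c
#define _GNU_SOURCE 1   /* on glibc, exposes lseek64 and off64_t */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

enum { DEMO_SIZE = 256, DEMO_TARGET = 200 };
static_assert(DEMO_TARGET < DEMO_SIZE, "target must lie inside the file");

/* Fills a temporary file so that the byte at offset i has value i, reads
 * 16 bytes through stdio, seeks to DEMO_TARGET, and returns the next byte
 * read (or -1 on failure). With use_fseek != 0 the seek goes through fseek;
 * otherwise it goes straight to the descriptor with lseek64. */
static int byte_after_seek(int use_fseek)
{
    unsigned char data[DEMO_SIZE], head[16], next = 0;
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    for (int i = 0; i < DEMO_SIZE; i++)
        data[i] = (unsigned char)i;
    int ok = fwrite(data, 1, sizeof data, fp) == sizeof data;
    rewind(fp);   /* flushes the written data and goes back to offset 0 */
    ok = ok && fread(head, 1, sizeof head, fp) == sizeof head;
    if (ok && use_fseek)
        ok = fseek(fp, DEMO_TARGET, SEEK_SET) == 0;
    else if (ok)
        ok = lseek64(fileno(fp), DEMO_TARGET, SEEK_SET) != (off64_t)-1;
    ok = ok && fread(&next, 1, 1, fp) == 1;
    fclose(fp);
    return ok ? next : -1;
}
```

byte_after_seek(1) returns 200. byte_after_seek(0) can instead return 16, the next byte still sitting in the stdio buffer, which is exactly the stale data seen in the bug.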
This is where setbuf comes in, to manipulate the buffer. fseek_64 was changed as follows:
int fseek_64(FILE *stream, int64_t offset, int origin)
{
    setbuf(stream, NULL);   /* clear the buffer (NULL switches the stream to unbuffered) */
    int fd = fileno(stream);
    if (lseek64(fd, offset, origin) == -1)
        return errno;
    return 0;
}
With this change, the bug above went away.
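Finally, one more issue. During the project I found that once a stream has reached EOF, lseek alone cannot make it readable again. I did not know whether fseek handles this case. If it does (which seemed very likely), our fseek_64 should be improved further to use rewind, or even reopen the file, so that the stream becomes readable again. In that case the code would look like this: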
int fseek_64(FILE *stream, int64_t offset, int origin)
{
    if (feof(stream)) {
        rewind(stream);         /* also clears the EOF indicator */
    } else {
        setbuf(stream, NULL);   /* clear fread's buffer */
    }
    int fd = fileno(stream);
    if (lseek64(fd, offset, origin) == -1) {
        return errno;
    }
    return 0;
}
I will update this post with the result once it has been verified.
update:
Verified: fseek makes a stream that has already hit EOF readable again, and so does rewind. So the code above works.
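A variation on the same idea, not the code used above: the C standard only permits setbuf/setvbuf before any other operation on a stream, so this sketch turns buffering off once, right after opening the file, and the seek wrapper then only clears the EOF/error indicators and moves the descriptor. The names open_unbuffered and fseek_64_unbuffered are invented for illustration, and an unbuffered stream typically costs one system call per fread, which hurts throughput for small reads.

```c
#define _GNU_SOURCE 1   /* on glibc, exposes lseek64 and off64_t */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open a file and disable stdio buffering before any
 * other operation touches the stream. Returns NULL on failure. */
static FILE *open_unbuffered(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp != NULL && setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Hypothetical seek for streams obtained from open_unbuffered().
 * Returns 0 on success and an errno value on failure, mirroring the
 * convention of the article's fseek_64. */
static int fseek_64_unbuffered(FILE *stream, int64_t offset, int origin)
{
    assert(stream != NULL);
    clearerr(stream);   /* drop the EOF and error indicators */
    if (lseek64(fileno(stream), (off64_t)offset, origin) == (off64_t)-1)
        return errno;
    return 0;
}
```

Because nothing is read ahead on such a stream, the descriptor offset and the stream position stay in step, so SEEK_CUR behaves as expected as long as ungetc is not used.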