
Notes on fstream File I/O

輝夜 (tadvent), posted 2009-10-06 18:24 in C/C++, tags: VC STL c++ gcc string fstream, 4659 views

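File I/O on larger data sets is very often the bottleneck. To speed things up, it sometimes helps to read the file in large chunks first and process it afterwards. Two common ways of doing this are examined below.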

1. Reading the whole file into a string in one go.

As far as I can tell, neither std::getline, istream::getline, nor the formatted operator>> offers a direct way to read into a string all the way to the end of the file; istreambuf_iterator does:

ifstream in("input.txt");
string instr((istreambuf_iterator<char>(in)), istreambuf_iterator<char>());

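The first argument to the string constructor needs an extra pair of () around it, otherwise the compiler takes the whole line for a function declaration = = ...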
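The two lines above can be wrapped into a small function. This is only a sketch: the name `read_whole_file` is made up for illustration, and opening in binary mode is my assumption (the original snippet opens in text mode).

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not from the original post): read the whole file at
// `path` into a std::string using istreambuf_iterator.
std::string read_whole_file(const char* path) {
    // Assumption: binary mode, so no newline translation happens on Windows.
    std::ifstream in(path, std::ios::binary);
    // The extra parentheses around the first argument stop the compiler from
    // parsing this line as a function declaration.
    std::string text((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    return text;  // empty if the file could not be opened
}
```

Calling `read_whole_file("input.txt")` returns the file contents, or an empty string when the file is missing.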

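Read this way, the string grows dynamically with the content, and every time it runs out of room it triggers an extra reallocation and copy. To avoid that, it is worth reserving enough space up front: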

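ifstream in("input.txt", ios::binary);   // binary mode so tellg() matches the bytes actually read
in.seekg(0, ios::end);
streamoff len = in.tellg();               // -1 if the stream is in a failed state
in.seekg(0, ios::beg);

string instr;
if (len > 0)
    instr.reserve(static_cast<string::size_type>(len));   // one allocation up front
instr.assign(istreambuf_iterator<char>(in), istreambuf_iterator<char>());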
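A variation on the same idea, offered as a sketch rather than as the method used above: size the string first and fill it with a single `read` call instead of going through iterators. The function name `read_whole_file_presized` is hypothetical, and the code relies on string storage being contiguous, which the standard guarantees from C++11 on.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: measure the file, size the string once, then fill it
// with one bulk read. This swaps the iterator-based assign for istream::read.
std::string read_whole_file_presized(const char* path) {
    // Open at the end (ios::ate) so tellg() reports the file size directly.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return std::string();

    const std::streamoff len = in.tellg();  // -1 on failure
    if (len <= 0) return std::string();
    in.seekg(0, std::ios::beg);

    std::string text(static_cast<std::string::size_type>(len), '\0');
    in.read(&text[0], static_cast<std::streamsize>(len));
    // Keep only what was actually read, in case the read came up short.
    text.resize(static_cast<std::string::size_type>(in.gcount()));
    return text;
}
```

The trade-off is that the string is zero-filled before being overwritten, in exchange for a single block read.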

2. Reading the whole file into a stringstream in one go.

A filebuf and a stringbuf cannot simply be redirected into each other through rdbuf(), so getting from a filebuf to a stringbuf takes one copy. The simplest way is to copy the entire streambuf:

ifstream in("input.txt");
stringstream ss;
ss<<in.rdbuf();
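To show what the copy buys, here is a minimal sketch that slurps a file into a stringstream and then runs formatted extraction on the in-memory copy. The function name `sum_integers_in_file` and the assumption that the file holds whitespace-separated integers are mine, purely for illustration.

```cpp
#include <fstream>
#include <sstream>

// Hypothetical helper: copy the file into a stringstream in one step, then
// parse integers from memory instead of from the file.
long sum_integers_in_file(const char* path) {
    std::ifstream in(path);
    std::stringstream ss;
    ss << in.rdbuf();  // copies everything from the filebuf into the stringbuf

    long total = 0;
    long value = 0;
    while (ss >> value) {  // formatted reads now hit the in-memory buffer
        total += value;
    }
    return total;
}
```

Note that `ss << in.rdbuf()` sets failbit on `ss` when nothing was copied (for example, an empty file), so the loop simply does not run in that case.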

As with string, reallocation and copying are a concern here too. A streambuf's buffer is not nearly as easy to manipulate, though; the workaround is to hand it a block of memory ourselves:

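ifstream in("input.txt", ios::binary);    // binary mode so tellg() matches the bytes actually read
in.seekg(0, ios::end);
streamoff len = in.tellg();                // -1 if the stream is in a failed state
in.seekg(0, ios::beg);
if (len < 0) len = 0;

vector<char> buffer(static_cast<vector<char>::size_type>(len) + 1);   // +1 so &buffer[0] is valid even for an empty file
in.read(&buffer[0], len);

stringstream ss;
ss.rdbuf()->pubsetbuf(&buffer[0], len);    // the effect is implementation-defined for stringbuf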

Finally, a quick gripe about VC's STL = =...

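VC's compiler itself is hard to fault on efficiency, but that counts for little if the STL drags it down. For file I/O (fstream) it is several times slower than the library that ships with MinGW (GCC 4.4.0). GCC's fstream formatted reads and writes are already on par with C's, and there should be room for further improvement (format handling at compile time vs. at run time).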
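Anyone wanting to check such a comparison on their own toolchain could start from something like the sketch below. It is not the benchmark behind the numbers above: the function names, the input format (whitespace-separated integers), and the use of C++11 `std::chrono` are all my assumptions, and the timings will vary with compiler, library, and disk cache.

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>

// Formatted read with C stdio.
long sum_with_fscanf(const char* path) {
    std::FILE* f = std::fopen(path, "r");
    if (!f) return 0;
    long total = 0;
    long value = 0;
    while (std::fscanf(f, "%ld", &value) == 1) {
        total += value;
    }
    std::fclose(f);
    return total;
}

// Formatted read with C++ fstream.
long sum_with_ifstream(const char* path) {
    std::ifstream in(path);
    long total = 0;
    long value = 0;
    while (in >> value) {
        total += value;
    }
    return total;
}

// Hypothetical timing wrapper: runs `reader(path)`, stores its result in
// `result`, and returns the elapsed wall-clock time in seconds.
template <typename Reader>
double seconds_taken(Reader reader, const char* path, long& result) {
    const auto start = std::chrono::steady_clock::now();
    result = reader(path);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

For a meaningful comparison, run each reader several times on a large file and compare the best times, since the first run also pays for loading the file into the OS cache.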

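Also, the last program above probably will not give the expected result under VS2008 (VC9.0). Stepping into it, I found that in VC's standard library the setbuf that pubsetbuf forwards to has an empty body! Here it is (pubsetbuf reaches it through one more layer of function call):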

virtual _Myt *__CLR_OR_THIS_CALL setbuf(_Elem *, streamsize)
        {       // offer buffer to external agent (do nothing)
        return (this);
        }

Looks like it is waiting for us to derive from it = =. With MinGW (GCC 4.4.0), on the other hand, the expected result is obtained.
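Deriving from streambuf does not take much. The sketch below is one possible way to do it, not something taken from either library: a read-only buffer class (the name `MemoryBuf` is made up) that points the get area at memory the caller owns, so it does not depend on what setbuf happens to do. It supports sequential reads only; seeking would need seekoff/seekpos overrides.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <vector>

// Hypothetical read-only stream buffer over caller-owned memory.
// The memory must outlive the MemoryBuf and any stream using it.
class MemoryBuf : public std::streambuf {
public:
    MemoryBuf(char* data, std::size_t size) {
        // Get area: [data, data + size), with the read position at the start.
        setg(data, data, data + size);
    }
    // The default underflow() returns EOF once the get area is exhausted,
    // which is exactly the behavior wanted here.
};

// Usage sketch: parse whitespace-separated integers straight out of a buffer
// that was filled earlier (for example by ifstream::read).
long sum_integers_in_buffer(std::vector<char>& buffer) {
    if (buffer.empty()) return 0;
    MemoryBuf mb(&buffer[0], buffer.size());
    std::istream in(&mb);
    long total = 0;
    long value = 0;
    while (in >> value) {
        total += value;
    }
    return total;
}
```

Because the stream reads directly from the vector, no second copy of the file contents is made.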
