History log of /linux-master/fs/nilfs2/segbuf.c
Revision Date Author Comments
# f7aeb97a 22-Jan-2024 Ryusuke Konishi <konishi.ryusuke@gmail.com>

nilfs2: convert segment buffer to use kmap_local

In the segment buffer code used for log writing, a CRC calculation routine
uses the deprecated kmap_atomic(), so convert it to use kmap_local.

Link: https://lkml.kernel.org/r/20240122140202.6950-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 679bd7eb 08-Jun-2023 Ryusuke Konishi <konishi.ryusuke@gmail.com>

nilfs2: fix buffer corruption due to concurrent device reads

As a result of analysis of a syzbot report, it turned out that in three
cases where nilfs2 allocates block device buffers directly via sb_getblk,
concurrent reads to the device can corrupt the allocated buffers.

Nilfs2 uses sb_getblk for segment summary blocks, that make up a log
header, and the super root block, that is the trailer, and when moving and
writing the second super block after fs resize.

In any of these, since the uptodate flag is not set when storing metadata
to be written in the allocated buffers, the stored metadata will be
overwritten if a device read of the same block occurs concurrently before
the write. This causes metadata corruption and misbehavior in the log
write itself, causing warnings in nilfs_btree_assign() as reported.

Fix these issues by setting an uptodate flag on the buffer head on the
first or before modifying each buffer obtained with sb_getblk, and
clearing the flag on failure.

When setting the uptodate flag, the lock_buffer/unlock_buffer pair is used
to perform necessary exclusive control, and the buffer is filled to ensure
that uninitialized bytes are not mixed into the data read from others. As
for buffers for segment summary blocks, they are filled incrementally, so
if the uptodate flag was unset on their allocation, set the flag and zero
fill the buffer once at that point.

Also, regarding the superblock move routine, the starting point of the
memset call to zerofill the block is incorrectly specified, which can
cause a buffer overflow on file systems with block sizes greater than
4KiB. In addition, if the superblock is moved within a large block, it is
necessary to assume the possibility that the data in the superblock will
be destroyed by zero-filling before copying. So fix these potential
issues as well.

Link: https://lkml.kernel.org/r/20230609035732.20426-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+31837fe952932efc8fb9@syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/00000000000030000a05e981f475@google.com
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# b9b1335e 22-Mar-2022 NeilBrown <neilb@suse.de>

remove bdi_congested() and wb_congested() and related functions

These functions are no longer useful as no BDIs report congestions any
more.

Removing the test on bdi_write_contested() in current_may_throttle()
could cause a small change in behaviour, but only when PF_LOCAL_THROTTLE
is set.

So replace the calls by 'false' and simplify the code - and remove the
functions.

[akpm@linux-foundation.org: fix build]

Link: https://lkml.kernel.org/r/164549983742.9187.2570198746005819592.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs]
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# fbe7c2ef 22-Feb-2022 Christoph Hellwig <hch@lst.de>

nilfs2: pass the operation to bio_alloc

Refactor the segbuf write code to pass the op to bio_alloc instead of
setting it just before the submission.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Link: https://lore.kernel.org/r/20220222154634.597067-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 07888c66 24-Jan-2022 Christoph Hellwig <hch@lst.de>

block: pass a block_device and opf to bio_alloc

Pass the block_device and operation that we plan to use this bio for to
bio_alloc to optimize the assignment. NULL/0 can be passed, both for the
passthrough case on a raw request_queue and to temporarily avoid
refactoring some nasty code.

Also move the gfp_mask argument after the nr_vecs argument for a much
more logical calling convention matching what most of the kernel does.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220124091107.642561-18-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# f0d91192 24-Jan-2022 Christoph Hellwig <hch@lst.de>

nilfs2: remove nilfs_alloc_seg_bio

bio_alloc will never fail when it can sleep. Remove the now simple
nilfs_alloc_seg_bio helper and open code it in the only caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220124091107.642561-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 94ee1d91 08-Nov-2021 Ryusuke Konishi <konishi.ryusuke@gmail.com>

nilfs2: remove filenames from file comments

Remove filenames that are not particularly useful in file comments, and
suppress checkpatch warnings

WARNING: It's generally not useful to have the filename in the file

Link: https://lkml.kernel.org/r/1635151862-11547-3-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Qing Wang <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a8affc03 10-Mar-2021 Christoph Hellwig <hch@lst.de>

block: rename BIO_MAX_PAGES to BIO_MAX_VECS

Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been
horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop
confusing users of the bio API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 64820ac6 26-Jan-2021 Christoph Hellwig <hch@lst.de>

nilfs2: remove cruft in nilfs_alloc_seg_bio

bio_alloc never returns NULL when it can sleep.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# a1d0747a 11-Aug-2020 Joe Perches <joe@perches.com>

nilfs2: use a more common logging style

Add macros for nilfs_<level>(sb, fmt, ...) and convert the uses of
'nilfs_msg(sb, KERN_<LEVEL>, ...)' to 'nilfs_<level>(sb, ...)' so nilfs2
uses a logging style more like the typical kernel logging style.

Miscellanea:

o Realign arguments for these uses

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1595860111-3920-4-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ae98043f 04-Sep-2018 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: convert to SPDX license tags

Remove the verbose license text from NILFS2 files and replace them with
SPDX tags. This does not change the license of any of the code.

Link: http://lkml.kernel.org/r/1535624528-5982-1-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# fb04b91b 06-Feb-2018 Arnd Bergmann <arnd@arndb.de>

nilfs2: use time64_t internally

The superblock and segment timestamps are used only internally in nilfs2
and can be read out using sysfs.

Since we are using the old 'get_seconds()' interface and store the data
as timestamps, the behavior differs slightly between 64-bit and 32-bit
kernels, the latter will show incorrect timestamps after 2038 in sysfs,
and presumably fail completely in 2106 as comparisons go wrong.

This changes nilfs2 to use time64_t with ktime_get_real_seconds() to
handle timestamps, making the behavior consistent and correct on both
32-bit and 64-bit machines.

The on-disk format already uses 64-bit timestamps, so nothing changes
there.

Link: http://lkml.kernel.org/r/20180122211050.1286441-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 74d46992 23-Aug-2017 Christoph Hellwig <hch@lst.de>

block: replace bi_bdev with a gendisk pointer and partitions index

This way we don't need a block_device structure to submit I/O. The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open. Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device. But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 4e4cbee9 03-Jun-2017 Christoph Hellwig <hch@lst.de>

block: switch bios to blk_status_t

Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep arround at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>


# feee880f 02-Aug-2016 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: reduce bare use of printk() with nilfs_msg()

Replace most use of printk() in nilfs2 implementation with nilfs_msg(),
and reduce the following checkpatch.pl warning:

"WARNING: Prefer [subsystem eg: netdev]_crit([subsystem]dev, ...
then dev_crit(dev, ... then pr_crit(... to printk(KERN_CRIT ..."

This patch also fixes a minor checkpatch warning "WARNING: quoted string
split across lines" that often accompanies the prior warning, and amends
message format as needed.

Link: http://lkml.kernel.org/r/1464875891-5443-5-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b2d45866 05-Jun-2016 Mike Christie <mchristi@redhat.com>

nilfs: use bio op accessors

Separate the op from the rq_flag_bits and have nilfs
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@fb.com>


# 4e49ea4a 05-Jun-2016 Mike Christie <mchristi@redhat.com>

block/fs/drivers: remove rw argument from submit_bio

This has callers of submit_bio/submit_bio_wait set the bio->bi_rw
instead of passing it in. This makes that use the same as
generic_make_request and how we set the other bio fields.

Signed-off-by: Mike Christie <mchristi@redhat.com>

Fixed up fs/ext4/crypto.c

Signed-off-by: Jens Axboe <axboe@fb.com>


# 0c6c44cb 23-May-2016 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: avoid bare use of 'unsigned'

This fixes checkpatch.pl warning "WARNING: Prefer 'unsigned int' to
bare use of 'unsigned'".

Link: http://lkml.kernel.org/r/1462886671-3521-5-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 4b420ab4 23-May-2016 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: clean up old e-mail addresses

E-mail addresses of osrg.net domain are no longer available. This
removes them from authorship notices and prevents reporters from being
confused.

Link: http://lkml.kernel.org/r/1461935747-10380-5-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5726d0b4 23-May-2016 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: remove FSF mailing address from GPL notices

This removes the extra paragraph which mentions FSF address in GPL
notices from source code of nilfs2 and avoids the checkpatch.pl error
related to it.

Link: http://lkml.kernel.org/r/1461935747-10380-4-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b54ffb73 19-May-2015 Kent Overstreet <kent.overstreet@gmail.com>

block: remove bio_get_nr_vecs()

We can always fill up the bio now, no need to estimate the possible
size based on queue parameters.

Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[hch: rebased and wrote a changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


# 4246a0b6 20-Jul-2015 Christoph Hellwig <hch@lst.de>

block: add a bi_error field to struct bio

Currently we have two different ways to signal an I/O error on a BIO:

(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario. Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


# b25de9d6 24-Apr-2015 Christoph Hellwig <hch@lst.de>

block: remove BIO_EOPNOTSUPP

Since the big barrier rewrite/removal in 2007 we never fail FLUSH or
FUA requests, which means we can remove the magic BIO_EOPNOTSUPP flag
to help propagating those to the buffer_head layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>


# 4f024f37 11-Oct-2013 Kent Overstreet <kmo@daterainc.com>

block: Abstract out bvec iterator

Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6


# 4bf93b50 22-Aug-2013 Vyacheslav Dubeyko <slava@dubeyko.com>

nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection

Fix the issue with improper counting number of flying bio requests for
BIO_EOPNOTSUPP error detection case.

The sb_nbio must be incremented exactly the same number of times as
complete() function was called (or will be called) because
nilfs_segbuf_wait() will call wail_for_completion() for the number of
times set to sb_nbio:

do {
wait_for_completion(&segbuf->sb_bio_event);
} while (--segbuf->sb_nbio > 0);

Two functions complete() and wait_for_completion() must be called the
same number of times for the same sb_bio_event. Otherwise,
wait_for_completion() will hang or leak.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 2df37a19 22-Aug-2013 Vyacheslav Dubeyko <slava@dubeyko.com>

nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error

Remove double call of bio_put() in nilfs_end_bio_write() for the case of
BIO_EOPNOTSUPP error detection. The issue was found by Dan Carpenter
and he suggests first version of the fix too.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7b9c0976 25-Nov-2011 Cong Wang <amwang@redhat.com>

nilfs2: remove the second argument of k[un]map_atomic()

Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Cong Wang <amwang@redhat.com>


# 6c6de1aa 30-Apr-2011 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: super root size should change depending on inode size

The size of super root structure depends on inode size, so
NILFS_SR_BYTES macro should be a function of the inode size. This
fixes the issue.

Even though a different size value will be written for a possible
future filesystem with extended inode, but fortunately this does not
break disk format compatibility.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 1cb2d38c 03-Apr-2011 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: get rid of private page allocator

Previously, nilfs was cloning pages for mmapped region to freeze their
data and ensure consistency of checksum during writeback cycles. A
private page allocator was used for this page cloning. But, we no
longer need to do that since clear_page_dirty_for_io function sets up
pte so that vm_ops->page_mkwrite function is called right before the
mmapped pages are modified and nilfs_page_mkwrite function can safely
wait for the pages to be written back to disk.

So, this stops making a copy of mmapped pages during writeback, and
eliminates the private page allocation and deallocation functions from
nilfs.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 721a9602 09-Mar-2011 Jens Axboe <jaxboe@fusionio.com>

block: kill off REQ_UNPLUG

With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>


# 026a7d63 06-Oct-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: get rid of bdi from nilfs object

Nilfs now can use sb->s_bdi to get backing_dev_info, so we use it
instead of ns_bdi on the nilfs object and remove ns_bdi.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 7b6d91da 07-Aug-2010 Christoph Hellwig <hch@lst.de>

block: unify flags for struct bio and struct request

Remove the current bio flags and reuse the request flags for the bio, too.
This allows to more easily trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superflous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>


# 50614bcf 10-Apr-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: insert checkpoint number in segment summary header

This adds a field to record the latest checkpoint number in the
nilfs_segment_summary structure. This will help to recover the latest
checkpoint number from logs on disk. This field is intended for
crucial cases in which super blocks have lost pointer to the latest
log.

Even though this will change the disk format, both backward and
forward compatibility is preserved by a size field prepared in the
segment summary header.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 41c88bd7 05-Apr-2010 Li Hong <lihong.hi@gmail.com>

nilfs2: cleanup multi kmem_cache_{create,destroy} code

This cleanup patch gives several improvements:

- Moving all kmem_cache_{create_destroy} calls into one place, which removes
some small function calls, cleans up error check code and clarify the logic.

- Mark all initial code in __init section.

- Remove some very obvious comments.

- Adjust some declarations.

- Fix some space-tab issues.

Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# aaed1d5b 22-Mar-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: move out checksum routines to segment buffer code

This moves out checksum routines in log writer to segbuf.c for
cleanup.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 1e2b68bf 22-Mar-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: move pointer to super root block into logs

This moves a pointer to buffer storing super root block to each log
buffer from nilfs_sc_info struct for simplicity.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# d067633b 22-Mar-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: fix imperfect completion wait in nilfs_wait_on_logs

nilfs_wait_on_logs has a potential to slip out before completion of
all bio requests when it met an error. This synchronization fault may
cause unexpected results, for instance, violative access to freed
segment buffers from an end-bio callback routine.

This fixes the issue by ensuring that nilfs_wait_on_logs waits all
given logs.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 55480a06 13-Mar-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: remove spaces before tabs

This kills the following checkpatch warnings:

WARNING: please, no space before tabs
#74: FILE: segment.h:74:
+^Iunsigned ^I^Iflags;$

WARNING: please, no space before tabs
#35: FILE: segbuf.c:35:
+^Iint ^I^I^Istart, end; /* The region to be submitted */$

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 7a65004b 13-Mar-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: fix various typos in comments

This fixes various typos I found in comments of nilfs2.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 6c477d44 13-Mar-2010 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: fix discrepancy in use of static specifier

Two segbuf functions, nilfs_segbuf_write and nilfs_segbuf_wait, are
declared with the static storage class specifier, but their
implementations are not.

This fixes the discrepancy.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# d1c6b72a 16-Dec-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: move iterator to write log into segment buffer

This moves iterator to submit write requests for a series of logs into
segbuf.c, and hides nilfs_segbuf_write() and nilfs_segbuf_wait() in
the file.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# a694291a 29-Nov-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: separate wait function from nilfs_segctor_write

This separates wait function for submitted logs from the write
function nilfs_segctor_write(). A new list of segment buffers
"sc_write_logs" is added to hold logs under writing, and double
buffering is partially applied to hide io latency.

At this point, the double buffering is disabled for blocksize <
pagesize because page dirty flag is turned off during write and dirty
buffers are not properly collected for pages crossing over segments.

To receive full benefit of the double buffering, further refinement is
needed to move the io wait outside the lock section of log writer.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# e29df395 29-Nov-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: add iterator for segment buffers

This adds a few iterator functions for segment buffers to make it easy
to handle multiple series of logs.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 9c965bac 28-Nov-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: hide nilfs_write_info struct in segment buffer code

Hides nilfs_write_info struct and nilfs_segbuf_prepare_write function
in segbuf.c to simplify the interface of nilfs_segbuf_write function.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 9284ad2a 24-Nov-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: relocate io status variables to segment buffer

This moves io status variables in nilfs_write_info struct to
nilfs_segment_buffer struct.

This is a preparation to hide nilfs_write_info in segment buffer code.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 5f1586d0 29-Nov-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: do not return io error for bio allocation failure

Previously, log writer had possibility to set an io error flag on
segments even in case of memory allocation failure.

This fixes the issue.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# c1b353f0 19-Jun-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: use GFP_NOIO for bio_alloc instead of GFP_NOWAIT

Alberto Bertogli advised me about bio_alloc() use in nilfs:
On Sat, 13 Jun 2009 22:52:40 -0300, Alberto Bertogli wrote:
> By the way, those bio_alloc()s are using GFP_NOWAIT but it looks
> like they could use at least GFP_NOIO or GFP_NOFS, since the caller
> can (and sometimes do) sleep. The only caller is nilfs_submit_bh(),
> which calls nilfs_submit_seg_bio() which can sleep calling
> wait_for_completion().

This takes in the comment and replaces the use of GFP_NOWAIT flag with
GFP_NOIO.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 30bda0b8 16-May-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: set bio unplug flag for the last bio in segment

This sets BIO_RW_UNPLUG flag on the last bio of each segment during
write. The last bio should be unplugged immediately because the
caller waits for the completion after the submission.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# 071cb4b8 16-May-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: eliminate removal list of segments

This will clean up the removal list of segments and the related
functions from segment.c and ioctl.c, which have hurt code
readability.

This elimination is applied by using nilfs_sufile_updatev() previously
introduced in the patch ("nilfs2: add sufile function that can modify
multiple segment usages").

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


# cece5520 06-Apr-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: simplify handling of active state of segments

will reduce some lines of segment constructor. Previously, the state was
complexly controlled through a list of segments in order to keep
consistency in meta data of usage state of segments. Instead, this
presents ``calculated'' active flags to userland cleaner program and stop
maintaining its real flag on disk.

Only by this fake flag, the cleaner cannot exactly know if each segment is
reclaimable or not. However, the recent extension of nilfs_sustat ioctl
struct (nilfs2-extend-nilfs_sustat-ioctl-struct.patch) can prevent the
cleaner from reclaiming in-use segment wrongly.

So, now I can apply this for simplification.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 64b5a32e 06-Apr-2009 Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>

nilfs2: segment buffer

This adds the segment buffer which is used to constuct logs.

[akpm@linux-foundation.org: BIO_RW_SYNC got removed]
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>