I'm running Red Hat 5 with ext4, and every day I see this message in /var/log/messages:
kernel: EXT4-fs (dm-5): Unaligned AIO/DIO on inode 230305773 by oracle; performance will be poor
The message is logged only once per day, and it also appears just after I start up the database. I have had a ticket open with Oracle Support for over a month, and they want me to log a ticket with Red Hat on this. I don't understand how this is an O/S issue when the error clearly happens when the database starts up. On top of that, I don't see this error on any ext3 filesystems.
There is an article I found that mentions that it is an application issue: http://askubuntu.com/questions/118140/unaligned-aio-dio
Has anyone else seen this error?
A couple of thoughts.
1. The message is just a message ... is there something that prevents you from ignoring it?
2. Are there any measurable performance issues?
3. What are the values of the database initialization parameters for Async and Direct I/O? Do you want us to guess whether you are using them?
4. Do you want us to guess as to your database version?
5. Have you done as Oracle Support requested and opened a ticket with Red Hat? If not, why?
1. The message is just a message ... is there something that prevents you from ignoring it? I believe the link I provided answers this question:
"So it means the program in question is trying to use the asynchronous I/O or direct I/O APIs with memory buffers that aren't aligned to the file system block boundaries, which forces the file system to perform the async operations on the file in series to avoid corruption."
2. Are there any measurable performance issues? We do have performance issues, but we can't tell whether they are related to queries or to this warning.
3. What are the values of the database initialization parameters for Async and Direct I/O? Do you want us to guess whether you are using them? disk_asynch_io is TRUE and filesystemio_options is set to SETALL.
4. Do you want us to guess as to your database version? 188.8.131.52
5. Have you done as Oracle Support requested and opened a ticket with Red Hat? Again, the link I pasted indicates that this is an issue with the software and not the operating system.
Edited by: mwadmin on Sep 10, 2012 7:29 AM
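For anyone wondering what "unaligned" means in the quote above, here is a rough Python sketch. It is only an illustration and not the kernel's actual check: I am assuming that a request counts as aligned when both its file offset and its end fall on file system block boundaries, and that `f_bsize` from `statvfs` reflects the file system block size (which I believe holds for ext4 but may not hold for every file system). The function names are made up for the example.

```python
import os

def fs_block_size(path):
    """Block size reported by the file system holding `path`.

    Assumption: f_bsize matches the file system block size, as it
    should on ext4; other file systems may report something else.
    """
    return os.statvfs(path).f_bsize

def is_block_aligned(offset, length, block_size):
    """True if the request starts and ends on block boundaries."""
    return offset % block_size == 0 and (offset + length) % block_size == 0

if __name__ == "__main__":
    bs = fs_block_size(".")
    # A 512-byte request at offset 512 falls inside a single larger block.
    print("block size:", bs)
    print("512 bytes at offset 512 aligned?", is_block_aligned(512, 512, bs))
    print("one full block at offset 0 aligned?", is_block_aligned(0, bs, bs))
```

On a 4K ext4 file system, any direct I/O request smaller than 4096 bytes fails this test, which is the situation the warning describes.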
Thanks, I will give those options a try. While they may help, I don't think they are going to fix this error, but I will report back.
Edit: After reviewing the options, only one seems safe to put in, and that is noatime. This won't really resolve the error message.
Edited by: mwadmin on Sep 17, 2012 2:09 PM
There are significant performance increases from going with ext4. Not only that, but it provides error checking on the fly, so you don't need to take any downtime to run periodic file system checks. Taking a 4-8 hour outage for a storage array every quarter is a lot of downtime.
It's not an error, it's a warning. What you need to do is determine if there is any performance impact due to the unaligned AIO/DIO requests. Based on my knowledge of lgwr I/O I suspect that there isn't any. But I haven't actually ever tried to verify this. You'll have to do that yourself. If for example lgwr I/O is significantly faster if you don't use async and direct I/O, then there may be a reason to look into this further. Otherwise I'd just live with the daily message and not worry about it.
I am also getting this message in the system logs (RHEL 6.2, Oracle 184.108.40.206, EXT4). The inodes refer to the redo log files (set a of all 4 sets, 1a - 4a), whereas the mirror copies of the redo logs (1b - 4b) don't make any noise. All the redo logs reside in the same filesystem. The message appears once every day, and its timestamp coincides with the DB startup time.
Edited by: paulsk on Nov 19, 2012 2:00 AM
Edited by: paulsk on Nov 19, 2012 2:02 AM
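If it helps anyone else map the inode number in the warning back to a file, here is a minimal Python sketch. The function name is made up, and the mount point in the usage comment is hypothetical; substitute the mount point of the device named in your own message. It does roughly what `find <mountpoint> -xdev -inum <inode>` does.

```python
import os

def find_by_inode(root, inode):
    """Return paths under `root` whose inode number equals `inode`.

    Inode numbers are only unique within one file system, so entries
    on a different device than `root` are skipped.
    """
    root_dev = os.stat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for d in dirnames:
            try:
                # Do not descend into other mounted file systems.
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    kept.append(d)
            except OSError:
                pass  # directory vanished or is unreadable
        dirnames[:] = kept
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if st.st_dev == root_dev and st.st_ino == inode:
                matches.append(path)
    return matches

# Hypothetical usage, run as a user that can read the datafile directories:
#   find_by_inode("/u01/oradata", 230305773)
```

Hard links share an inode, so more than one path can come back for the same number.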
We haven't made much progress with Oracle. I am going to install Oracle Unbreakable Linux and have them sort this mess out with Red Hat. Otherwise all I'll be doing is getting in the middle between the two vendors. Of course Oracle denies they are at fault, even though this only happens when using ext4.
There really isn't anything that will be done about this, either in the ext4 code or in the database code, unless someone can demonstrate that there is a real problem here, other than a daily warning message. And even then the solution is going to be to not use ext4, but use ASM instead.
Here is what development said in my SR:
This is a warning built into some versions of ext4 about a potential
performance penalty when using undersized block I/O. The penalty is only
really applicable (read: significant) when performed on unwritten blocks,
either unallocated ones or not-yet-zeroed ones.
This happens because redo log files use a block size defined by the sector
size of the disk, which on many disks is 512 bytes, while the block size of
ext4 is 4K. Using the disk sector size for redo log files is important for
recoverability purposes and should not be changed.
Note that redo log files are preformatted during creation, meaning no
unwritten blocks, and are written with a single thread, meaning no
concurrency issues, so the warning is not really meaningful. It will be
delivered daily because it is triggered for any unaligned/undersized write,
not just ones where an actual problem is likely to develop.
Therefore the results are expected and the warning should be ignored. If the
issue needs correction at all, it is to apply a version of Linux where the
warning is removed. (Bug 14096480 seems to claim this was done for Oracle's
Linux release V2.6.39-400.11.0, but it's not clear if it happened.)
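To make the 512-byte versus 4K arithmetic in that response concrete, here is a small Python sketch. The two sizes come from the response above; the write pattern is invented purely for illustration and is not taken from any real lgwr trace, and the function name is made up.

```python
REDO_BLOCK_SIZE = 512    # redo log block size, per the SR response
EXT4_BLOCK_SIZE = 4096   # ext4 block size, per the SR response

def unaligned_writes(writes, fs_block=EXT4_BLOCK_SIZE, redo_block=REDO_BLOCK_SIZE):
    """Return the writes that do not start and end on fs block boundaries.

    `writes` is a list of (first_redo_block, redo_block_count) pairs.
    """
    flagged = []
    for first, count in writes:
        offset = first * redo_block
        end = offset + count * redo_block
        if offset % fs_block != 0 or end % fs_block != 0:
            flagged.append((first, count))
    return flagged

# Hypothetical write pattern: only the first write covers a whole 4K block.
sample = [(0, 8), (8, 3), (11, 5), (16, 1)]
print(unaligned_writes(sample))  # [(8, 3), (11, 5), (16, 1)]
```

Eight 512-byte redo blocks fit in one 4K ext4 block, so a write only passes when it starts on a multiple of 8 redo blocks and is a multiple of 8 redo blocks long. Ordinary redo writes will rarely do that, which fits the statement that the warning fires for any undersized write.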
So it appears that no one is fixing this issue in Red Hat 5 or 6. I wish I could tell my customers the same about anything they request. I still see this as an application issue. This is happening on redo logs that are not aligned correctly. The O/S is just reporting the issue.