2008-09-13 00:13 -!- MaZe(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #tux3
2008-09-13 00:13 test
2008-09-13 00:20 it didn't work
2008-09-13 00:33 ok, one nasty little issue with putting the atom tables in the atomdict... ext2_create_entry likes to rely on the inode->i_size to know how many dirent blocks there are
2008-09-13 00:34 now the poor thing thinks there are an awful lot of them
2008-09-13 00:52 so this being inside the filesystem, I can actually let it have stuff out past the end of i_size
2008-09-13 00:53 and let the dirops happily continue using i_size to know how many dir blocks there are
2008-09-13 00:54 that's probably ok
2008-09-13 00:54 got to think about it
2008-09-13 00:55 now this is where I'd like somebody to be awake ;)
2008-09-13 00:55 I guess I'll go take a run around the pure track
2008-09-13 00:55 see if that focusses me
2008-09-13 01:10 -!- RazvanM(~RazvanM@pool-151-196-118-156.balt.east.verizon.net) has joined #tux3
2008-09-13 02:17 -!- Aks(~ankitsriv@123.237.71.198) has joined #tux3
2008-09-13 02:27 should have pinged if you needed someone awake
2008-09-13 02:28 maze, do you know what pread is supposed to do with read past end of file?
2008-09-13 02:29 I would think, return zero
2008-09-13 02:29 ah, I'll ptrace ;)
2008-09-13 02:29 brilliant
2008-09-13 02:29 you could just write a tiny program to test it
2008-09-13 02:30 my guess is pread should behave exactly like lseek read
2008-09-13 02:31 probably EINVAL
2008-09-13 02:31 but testing would see
2008-09-13 02:33 it's supposed to return zero
2008-09-13 02:34 and in fact something else is going on
2008-09-13 02:34 so never mind I'll sort it
2008-09-13 02:34 how was spore?
2008-09-13 02:34 haven't gotten to it yet
2008-09-13 02:34 how's the new fs?
2008-09-13 02:34 going through open tabs in firefox and closing them
2008-09-13 02:34 right
2008-09-13 02:34 lots of stuff from the week to catch up on before I reboot
2008-09-13 02:34 yah
2008-09-13 02:34 I'm about 30% through...
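The pread question above can be settled with a tiny test program, as suggested in the channel. A minimal sketch, assuming a POSIX system; the helper name `pread_past_eof` and the scratch file are illustrative and do not come from the log:

```c
#define _XOPEN_SOURCE 700   /* expose pread() under strict -std= modes */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write 4 bytes to a scratch file, then pread() 16 bytes starting at
 * offset 4096, well past end of file. Returns whatever pread() returned,
 * or -2 if the scratch file could not be set up.
 */
ssize_t pread_past_eof(void)
{
	char buf[16];
	ssize_t got = -2;
	FILE *scratch = tmpfile();   /* removed automatically on close */

	if (!scratch)
		return -2;
	int fd = fileno(scratch);
	if (write(fd, "tux3", 4) == 4)
		got = pread(fd, buf, sizeof buf, 4096);
	fclose(scratch);
	return got;
}
```

POSIX specifies that a read starting at or beyond end of file returns 0 rather than an error, which matches the conclusion reached in the log.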
2008-09-13 02:35 that's why it's good that firefox crashes
2008-09-13 02:35 it goes down, you lose 150 tabs, you find out they didn't matter, you get hours of your life back
2008-09-13 02:36 nope it restarts with all the tabs
2008-09-13 02:36 thankfully
2008-09-13 02:36 I actually always killall firefox-bin instead of closing it
2008-09-13 02:36 ok, got it sorted
2008-09-13 02:36 that way the tabs don't get lost ;-)
2008-09-13 02:36 so how is it?
2008-09-13 02:36 it's because I'm running 32 bit fileops
2008-09-13 02:36 shifted the block number, passed zero to pread
2008-09-13 02:37 everything makes sense
2008-09-13 02:37 well, It's a little tricky to test this high offset sparse file stuff
2008-09-13 02:38 I'll find a way around it
2008-09-13 02:38 it still seems like a good thing to do
2008-09-13 02:40 there we are, compiled with 64 bit r/w and got the proper error
2008-09-13 02:40 %
2008-09-13 02:40 5
2008-09-13 02:40 EIO
2008-09-13 02:48 hmm, I don't think there's really any need to test in 32bit userspace if the kernel's 64 bit, right?
2008-09-13 02:48 all the conversions happen way earlier at the syscall entry point...
2008-09-13 02:49 all combinations need testing eventually
2008-09-13 02:49 but true, you can live blissfully in 64 bit
2008-09-13 02:49 never worry about 32 bit
2008-09-13 02:50 yes. all combos need testing, but much later
2008-09-13 02:50 right
2008-09-13 02:50 I don't think that's something that needs worrying in the dev stage
2008-09-13 02:50 I'll stay in 32 bit
2008-09-13 02:50 it's the most demanding
2008-09-13 02:50 32 bit kernel?
2008-09-13 02:50 yes
2008-09-13 02:50 ah
2008-09-13 02:50 shapor runs 64 bit
2008-09-13 02:50 most others do
2008-09-13 02:50 so do I
2008-09-13 02:50 but if it doesn't work on 32 bit it doesn't exist
2008-09-13 02:50 I actually run an interesting system
2008-09-13 02:51 45 bit fedora 8.5
2008-09-13 02:51 45?
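The 32-bit failure described above (block number shifted, zero passed to pread) can be reproduced in isolation. A sketch, assuming 4 KB blocks and a target offset of 2^40 bytes; both numbers and the function names are illustrative assumptions, not taken from the tux3 source:

```c
#include <stdint.h>

/* Byte offset computed in 32-bit arithmetic: high bits are silently lost. */
uint32_t block_to_offset32(uint32_t block, unsigned blockbits)
{
	return block << blockbits;
}

/* Same computation widened to 64 bits before shifting. */
uint64_t block_to_offset64(uint32_t block, unsigned blockbits)
{
	return (uint64_t)block << blockbits;
}
```

With block number 2^28 and a shift of 12, the 32-bit version yields 0 while the 64-bit version yields 2^40. On glibc, a 32-bit build normally gets a 64-bit `off_t` by compiling with `-D_FILE_OFFSET_BITS=64`, which is presumably what "compiled with 64 bit r/w" refers to.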
2008-09-13 02:51 yeah, a joke
2008-09-13 02:51 haha
2008-09-13 02:51 I run a 23 db system
2008-09-13 02:51 it's a 32-bit fedora 8 system, with 64-bit kernel installed, some stuff to support that, and a 64-bit compiler, then upgraded to fedora 9, then more stuff upgraded to 64-bits
2008-09-13 02:51 that's the most important thing about it from my point of view
2008-09-13 02:51 wait, sorry
2008-09-13 02:52 it's 28 db
2008-09-13 02:52 hmm...
2008-09-13 02:52 let me see what it really is
2008-09-13 02:52 mine's a macbook pro laptop, so it's almost silent unless I run the procs full throttle
2008-09-13 02:52 I also have a headless (well with projector) box, which is also very quiet, since it's a shuttle
2008-09-13 02:53 anyway since my laptop is mostly 32 bits, I felt avg of 32 + 64 = 48 was not right
2008-09-13 02:53 so instead went with geometric mean sqrt(32 * 64) = sqrt(2) * 32 = 1.4 *32 = 32 + 12.8 = 44.8 ~ 45
2008-09-13 02:54 and that's why it's a 45 bit fedora 8.5 system
2008-09-13 02:54 brilliant logic, ain't it?
2008-09-13 02:55 right
2008-09-13 02:55 29 dba @ 1 meter
2008-09-13 02:55 ok night
2008-09-13 02:55 and it's still too noisy for me
2008-09-13 02:55 I want 27 now
2008-09-13 02:55 hard to get
2008-09-13 02:55 flips: good luck with design and coding as usual :)
2008-09-13 02:55 heh, that's why I use a laptop, and a headless box
2008-09-13 02:55 this is quieter than any laptop I've used
2008-09-13 02:55 the box can be on the other end of the room
2008-09-13 02:55 considerably
2008-09-13 02:56 really? even when you run with no cpu consumption on a laptop and spun down drives?
2008-09-13 02:56 spun down ok
2008-09-13 02:56 the hard drive is the noisiest component
2008-09-13 02:56 and I got the quiest ones on the market
2008-09-13 02:56 remember my laptop is basically a remote X/xterm/ssh server
2008-09-13 02:56 quietest
2008-09-13 02:56 sure
2008-09-13 02:57 but the drive doesn't stay spun down
2008-09-13 02:57 yeah, I actually use flash for some things
2008-09-13 02:57 in my experience
2008-09-13 02:57 and don't like things suddenly going "whirr" either ;-)
2008-09-13 02:57 quiet means, makes no noise
2008-09-13 02:57 like the root fs part that is read only (8gb), with tons of the non-permanent stuff living in tmpfs (although no swap)
2008-09-13 02:58 ok, you're hardcore
2008-09-13 02:58 I'd expect no less
2008-09-13 02:58 that way I have ro root fs with 8 gb, tmpfs with /var pieces
2008-09-13 02:58 my favorite box here is the fit pc
2008-09-13 02:58 no fan
2008-09-13 02:58 has a quiet 2.5 in drive
2008-09-13 02:58 ah
2008-09-13 02:58 which will be replaced with a 2.5 in flash drive
2008-09-13 02:58 pretty soon
2008-09-13 02:59 my daughter's favorite box too
2008-09-13 02:59 I actually have my flash drive in a raid array with an identical size partition on the hard disk
2008-09-13 02:59 has a fine linux distro on it
2008-09-13 02:59 didn't have to do a thing
2008-09-13 02:59 with the hard drive part set to raid mode 'write_mostly'
2008-09-13 02:59 that way if you pull the flash it falls back to the hard drive
2008-09-13 02:59 then you can remount,rw
2008-09-13 02:59 update the system
2008-09-13 02:59 ok, now I have this little logical problem
2008-09-13 02:59 remount,ro
2008-09-13 02:59 put the flash back
2008-09-13 02:59 in
2008-09-13 03:00 and then 8gbs sync to flash in one go
2008-09-13 03:00 - presto wear levelling solved even with ext3 ;-)
2008-09-13 03:00 cool
2008-09-13 03:00 that's md?
2008-09-13 03:00 must be
2008-09-13 03:00 dm can't sync ;-)
2008-09-13 03:00 once both are synced, there are no writes (read-only mount), and the hdd is write-mostly, so all reads hit swap
2008-09-13 03:00 yup it's md
2008-09-13 03:00 erm
2008-09-13 03:01 not swap - flash
2008-09-13 03:01 ddraid project is going to get going pretty soon
2008-09-13 03:01 cluster raid
2008-09-13 03:01 been gathering dust for some time
2008-09-13 03:01 nice and quiet - and fast - since seek time is awesome
2008-09-13 03:01 but it's going to be really useful
2008-09-13 03:01 and linear read is pretty much the same as a normal 2.5 inch drive
2008-09-13 03:01 (25mb/s)
2008-09-13 03:02 which drive is it?
2008-09-13 03:02 and since it's an expresscard 8gb flash - it doesn't stick out of the notebook - just sits nestled within the cavity
2008-09-13 03:02 ah
2008-09-13 03:02 uhm, some no name I picked up off of ebay for like 40 bucks a year and a half back
2008-09-13 03:02 seen em at fry's
2008-09-13 03:03 need 32 gb I think
2008-09-13 03:03 and with the drive spun down, there's less heat - thus less need for fans
2008-09-13 03:03 I don't really want to do work on a smaller one
2008-09-13 03:03 remember - this is just the OS
2008-09-13 03:03 data lives in the cloud
2008-09-13 03:03 in this case on the headless box
2008-09-13 03:04 nice little cloud
2008-09-13 03:04 a cumullo closetus
2008-09-13 03:04 or when at work... it lives elsewhere ;-)
2008-09-13 03:04 ok, my logistical problem
2008-09-13 03:04 and of course email and so on, are already in the cloud to begin with
2008-09-13 03:04 I'm testing this gigantic sparse file stuff
2008-09-13 03:04 and my linux oss can't do those big files
2008-09-13 03:04 my fs
2008-09-13 03:05 let me see
2008-09-13 03:05 why not?
2008-09-13 03:05 what kernel?
2008-09-13 03:05 oss?
2008-09-13 03:05 I'm writing stuff at 2^40 bytes out
2008-09-13 03:05 was that a typo for os?
2008-09-13 03:05 yes
2008-09-13 03:05 uhm, compiling 32-bit userspace with 32-bit kernel?
2008-09-13 03:05 so, ext3 just can't do that
2008-09-13 03:05 does 34 bits work?
2008-09-13 03:05 right, 32/32
2008-09-13 03:05 no
2008-09-13 03:06 2^40, see?
2008-09-13 03:06 tux3 is 2^48
2008-09-13 03:06 tux3 can do it
2008-09-13 03:06 asking whether 34 works
2008-09-13 03:06 even mapped into a loopback file
2008-09-13 03:06 34 doesn't
2008-09-13 03:06 um
2008-09-13 03:06 wait
2008-09-13 03:06 then your problem is compile options
2008-09-13 03:06 no, 34 should be ok
2008-09-13 03:06 #define USE_LARGEFILEOFFSET 64 or so
2008-09-13 03:06 yep, done
2008-09-13 03:07 the problem is the loopback file
2008-09-13 03:07 so 34 works?
2008-09-13 03:07 well
2008-09-13 03:07 34 is 16gb
2008-09-13 03:07 that should definitely work
2008-09-13 03:07 see 2^40 above
2008-09-13 03:07 33 is 8gb - that I know works - since that's dvd images ;-)
2008-09-13 03:07 well, that's why I'm asking if you've tested with 33
2008-09-13 03:07 terabyte
2008-09-13 03:07 no
2008-09-13 03:07 I guess I will
2008-09-13 03:07 then probably worth checking ;-)
2008-09-13 03:07 but I can't leave it that way
2008-09-13 03:08 not satisfactory
2008-09-13 03:08 if it doesn't work at 33, your problem is not in the kernel
2008-09-13 03:08 sure, I can do 15 minutes of testing with a much lower offset
2008-09-13 03:08 or half an hour
2008-09-13 03:08 but I need to write real code
2008-09-13 03:08 if it works at 33, but not at 40, then you've got a kernel internal problem
2008-09-13 03:08 it works fine
2008-09-13 03:08 like I said, this is just logical
2008-09-13 03:09 flips: btw, mmLinux was my first kernel project ever
2008-09-13 03:09 wait a minute
2008-09-13 03:09 just so that you know
2008-09-13 03:09 I think 2 TB is the biggest sparse file I can make on this system
2008-09-13 03:09 we're talking about the size of a file right?
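One way to arrive at a 2 TB ceiling is a 32-bit counter that counts 512-byte sectors. A small sketch of that arithmetic, which can also turn a size limit into a bit count; the helper names are made up for illustration:

```c
#include <stdint.h>

/* Largest byte count reachable by a counter of `count_bits` bits,
 * where each count stands for `unit_bytes` bytes. */
uint64_t max_bytes(unsigned count_bits, uint64_t unit_bytes)
{
	return ((uint64_t)1 << count_bits) * unit_bytes;
}

/* log2 of a power of two, i.e. how many offset bits a size limit implies. */
unsigned limit_bits(uint64_t limit)
{
	unsigned bits = 0;

	while (limit > 1) {
		limit >>= 1;
		bits++;
	}
	return bits;
}
```

Here `max_bytes(32, 512)` is 2^41 bytes, or 2 TB in binary units. By the same measure a 4 PB limit is 52 bits and an 8 EB limit is 63 bits.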
2008-09-13 03:09 I figured it was either going to make me or kill me
2008-09-13 03:09 bh, looks like a fine project
2008-09-13 03:09 right
2008-09-13 03:09 I've done well with the rap that it's given me
2008-09-13 03:09 yes
2008-09-13 03:09 that makes sense it's probably 4GB * 512 or something
2008-09-13 03:09 I didn't know about it at all
2008-09-13 03:09 however, I think I can do better
2008-09-13 03:10 maze, exactly
2008-09-13 03:10 lame
2008-09-13 03:10 it's the blocks count in the ext3 inode
2008-09-13 03:10 http://en.wikipedia.org/wiki/Ext3
2008-09-13 03:10 yeah, I was in the middle of the entire -rt thing and I got forgotten about, dropped out of existence
2008-09-13 03:10 see file size limit = 2tb
2008-09-13 03:10 measured in sectors instead of blocks, lamest idea ever
2008-09-13 03:10 unfixable
2008-09-13 03:10 apparently
2008-09-13 03:11 use jfs
2008-09-13 03:11 ok, so the correct way to test this is to boot tux3 up far enough that I can do the testing in tux3 files
2008-09-13 03:11 lol
2008-09-13 03:11 which go up to 2^48 (true)
2008-09-13 03:11 I wasn't joking
2008-09-13 03:11 been doing that already
2008-09-13 03:11 jfs does 4pb which is 52 bits
2008-09-13 03:11 for a few weeks even
2008-09-13 03:11 ah
2008-09-13 03:12 that's another option
2008-09-13 03:12 but it would limit people's ability to test
2008-09-13 03:12 or xfs
2008-09-13 03:12 8 exabyte limit
2008-09-13 03:12 which is 63 bits
2008-09-13 03:12 it's important that tux3 builds and tests on the lowest common denominator linux system
2008-09-13 03:12 can't require a specific host fs
2008-09-13 03:12 you can if you stick it in a loopback
2008-09-13 03:12 I am
2008-09-13 03:13 that gets me up to 2 TB on ext3
2008-09-13 03:13 then you merely need to format the loopback with jfs
2008-09-13 03:13 can't expect users to do that
2008-09-13 03:13 oh, sparse loopback
2008-09-13 03:13 I thought base fs -> file -> loopback -> jfs -> sparse file -> loopback -> tux3
2008-09-13 03:13 but that does get complex
2008-09-13 03:13 and probably blows the stack
2008-09-13 03:13 can't expect the user even to have jfs
2008-09-13 03:14 it's not compiled in by default
2008-09-13 03:14 well
2008-09-13 03:14 default on fedora
2008-09-13 03:14 probably comes in modules on most recent distros
2008-09-13 03:14 ok
2008-09-13 03:14 and ubuntu too, in a module I expect
2008-09-13 03:14 but still
2008-09-13 03:14 module of course
2008-09-13 03:14 then they have to mess with a fragile loopback
2008-09-13 03:14 it's pi o'clock
2008-09-13 03:14 and really tux3 can do that by itself
2008-09-13 03:14 heh
2008-09-13 03:14 so it is
2008-09-13 03:15 what are your plans for october 31?
2008-09-13 03:15 uhm
2008-09-13 03:15 it's a friday
2008-09-13 03:15 halloween
2008-09-13 03:15 looks like it's a friday
2008-09-13 03:16 looks like none at the moment
2008-09-13 03:16 I'm thinking of arranging an official cabal meeting that day
2008-09-13 03:16 were?
2008-09-13 03:16 just an idea
2008-09-13 03:16 rather: where?
2008-09-13 03:17 somewhere in the fear and loathing of LA
2008-09-13 03:17 ugh, would need to drive down...
2008-09-13 03:17 with a web presence
2008-09-13 03:17 unless we organized it somewhere mid-way
2008-09-13 03:17 possible
2008-09-13 03:17 like I say, just an idea at the moment
2008-09-13 03:18 bay area certainly could be good for attendance
2008-09-13 03:18 where are you?
2008-09-13 03:18 santa monica
2008-09-13 03:18 socal
2008-09-13 03:18 something like santa maria
2008-09-13 03:19 close
2008-09-13 03:19 or grover beach
2008-09-13 03:19 I'm sure those saints all live near each other
2008-09-13 03:19 that was a suggestion of a mid-way meet point
2008-09-13 03:19 right next to the beach
2008-09-13 03:19 indeed
2008-09-13 03:19 oh
2008-09-13 03:19 I see
2008-09-13 03:20 it's outside the LA basin
2008-09-13 03:20 outside of LA jams
2008-09-13 03:20 so you deal with LA
2008-09-13 03:20 we norcal folks get a larger distance to travel
2008-09-13 03:21 I'd probably take highway 1 and leave 3 hours early - since I love that drive... but oh well ;-)
2008-09-13 03:21 150 miles on pch...
2008-09-13 03:21 yes, it's not too far
2008-09-13 03:21 anyway, just an idea
2008-09-13 03:21 that's 130 miles from you, 230 from me
2008-09-13 03:22 ok I know what I'll do
2008-09-13 03:22 something around there, some restaurant?
2008-09-13 03:22 about my logistical problem
2008-09-13 03:22 I'll make the position of the atom refcount map a variable in the superblock
2008-09-13 03:22 and set it really low for unit testing
2008-09-13 03:22 duh
2008-09-13 03:23 yes, something like that
2008-09-13 03:23 130 miles I can handle
2008-09-13 03:23 you're younger, can handle further
2008-09-13 03:23 I need to check with such folks as natalie
2008-09-13 03:23 we could do this as an LA-only event
2008-09-13 03:23 or try for larger coverage
2008-09-13 03:24 where's she from?
2008-09-13 03:24 ukraine
2008-09-13 03:24 lives in LA
2008-09-13 03:24 I brought her into goog
2008-09-13 03:24 looks like SM
2008-09-13 03:24 goog's lucky about that
2008-09-13 03:24 most likely
2008-09-13 03:24 we can patch you in ;-)
2008-09-13 03:25 patch as in?
2008-09-13 03:25 remotely
2008-09-13 03:25 that's another option
2008-09-13 03:25 to do it with a web presence
2008-09-13 03:25 oh, as in organize it in the office? to vc?
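The idea above of keeping the atom refcount map position in the superblock, so unit tests can place it low, might look roughly like this. Everything here is a hypothetical sketch: the struct, the field names, and the assumption of 2-byte counts packed into blocks are invented for illustration and are not the tux3 layout:

```c
#include <stdint.h>

/* Hypothetical superblock fragment: the map base is data, not a constant. */
struct sb_sketch {
	unsigned blockbits;      /* log2 of the block size in bytes */
	uint64_t atomref_base;   /* first block of the refcount map */
};

/* Block holding the count for `atom`, assuming 2-byte counts, so one
 * block of 2^blockbits bytes holds 2^(blockbits - 1) of them. */
uint64_t atomref_block(const struct sb_sketch *sb, uint32_t atom)
{
	return sb->atomref_base + (atom >> (sb->blockbits - 1));
}
```

A production value could put the base at block 2^28 (byte offset 2^40 with 4 KB blocks), while a unit test could use a base of 16 and stay well inside what any host filesystem can hold.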
2008-09-13 03:25 the idea is to have some, um, ethanol involved
2008-09-13 03:26 not quite
2008-09-13 03:26 let's have an email loop
2008-09-13 03:26 cause I very heavily doubt I have the net uplink for any decent vc at home
2008-09-13 03:26 about it
2008-09-13 03:26 it's just comsucktik
2008-09-13 03:26 yes
2008-09-13 03:26 well
2008-09-13 03:42 just a question: do you test tux3 on 64bit?
2008-09-13 03:44 because it seems that all I get are error messages :)
2008-09-13 03:45 or maybe I need a newer fuse-version. Which one are you using?
2008-09-13 03:45 2.7.3 here
2008-09-13 03:53 flips is on 2.7.4
2008-09-13 03:53 and shapor and I are on 64 bit
2008-09-13 03:53 fuse version is set to 27
2008-09-13 03:53 hmm... I thought at least creating files should work under fuse, shouldn't it?
2008-09-13 03:53 in the source
2008-09-13 03:54 it should
2008-09-13 03:54 post your error?
2008-09-13 03:54 data, you have a web paste utility you use?
2008-09-13 03:55 well, there are a few, but not a single one
2008-09-13 03:55 which do you prefer?
2008-09-13 03:55 any
2008-09-13 03:55 just paste your output there
2008-09-13 03:55 and let konrad go at it ;-)
2008-09-13 03:55 heh
2008-09-13 03:56 I think tux3fuse is the toy to use
2008-09-13 03:56 I can't see tux3fs as being useful anymore with the low level one there
2008-09-13 03:56 desktop test # touch test
2008-09-13 03:56 desktop test # ls
2008-09-13 03:56 desktop test #
2008-09-13 03:56 nothing shows up
2008-09-13 03:56 konrad, I'll leave that question to you and shapor
2008-09-13 03:56 first error.
2008-09-13 03:56 data: at least it doesn't crash :)
2008-09-13 03:56 I think you're right but I'm not the expert
2008-09-13 03:57 pff, I first looked at fuse the day I sent that email
2008-09-13 03:57 that's more than me
2008-09-13 03:57 still haven't looked at it
2008-09-13 03:57 heh
2008-09-13 03:57 -su: echo: write error: Transport endpoint is not connected
2008-09-13 03:57 atom refcounting getting closer
2008-09-13 03:58 echo "foo" > bar
2008-09-13 03:58 means fuse didn't start
2008-09-13 03:58 run with -f
2008-09-13 03:58 that is, make defuse
2008-09-13 03:58 right. using that
2008-09-13 03:59 and it hangs like it should?
2008-09-13 03:59 anyway, you want to paste all the output
2008-09-13 03:59 there should be lots
2008-09-13 03:59 hm, I'm not getting tux3fuse to mount anything here
2008-09-13 03:59 http://www.nomorepasting.com/getpaste.php?pasteid=20198
2008-09-13 04:00 when I do: echo "foo" > bar
2008-09-13 04:00 the steps are something like: dd if=/dev/zero of=./dev seek=100M count=1; ./tux3 mkfs ./dev; ./tux3fuse dev tmp/
2008-09-13 04:00 yes?
2008-09-13 04:00 oh, segment fault
2008-09-13 04:01 I'm not even getting that
2008-09-13 04:01 i just used make defuse
2008-09-13 04:01 you want to find where the segfault is
2008-09-13 04:01 tux3_init: fdsize64 failed for 'dev' (Bad file descriptor)!
2008-09-13 04:01 konrad's on it ;)
2008-09-13 04:01 nah I got to get to sleep
2008-09-13 04:02 and i have to do more algebra (bah!)
2008-09-13 04:02 and I'm moving in ~5 days so I may be otherwise occupied come this tuesday evening
2008-09-13 04:02 data, try: sudo gdb -args ./tux3fuse /tmp/testdev /tmp/test -f
2008-09-13 04:02 slight variation
2008-09-13 04:03 run under gdb
2008-09-13 04:03 0x0000000000406239 in xcache_limit (xcache=0x0) at tux3.h:284
2008-09-13 04:03 284 return (void *)xcache + xcache->size;
2008-09-13 04:03 yup it's a bug
2008-09-13 04:04 shall we chase it tomorrow?
2008-09-13 04:04 it's the middle of the day? :P
2008-09-13 04:04 oh right
2008-09-13 04:04 well
2008-09-13 04:04 data: we're in PST, it's 4am for flips and I :D
2008-09-13 04:04 we need to get a tux3 debug center going over there in europe
2008-09-13 04:04 i'll have a look at it if I find the time
2008-09-13 04:04 ok
2008-09-13 04:04 good luck
2008-09-13 04:04 it's just a bug ;-)
2008-09-13 04:04 but otherwise I'll be around tomorrow
2008-09-13 04:05 now you can run under gdb, makes it easier to chase
2008-09-13 08:21 -!- pgquiles(~pgquiles@229.Red-83-49-101.dynamicIP.rima-tde.net) has joined #tux3
2008-09-13 10:10 -!- Aks(~ankitsriv@123.237.71.198) has left #tux3
2008-09-13 12:22 well, xattr get/set actually seem to be an interesting method for extended operations on inodes
2008-09-13 13:26 maze, how is the new fs going?
2008-09-13 13:26 writing the makefile...
2008-09-13 13:27 that's most of the work, if you write the fs in "make" and use fuse
2008-09-13 13:27 you didn't sleep much
2008-09-13 13:28 starting from the makefile ;-)
2008-09-13 13:28 want something that compiles
2008-09-13 13:28 and I'm writing straight in kernel-space
2008-09-13 13:28 hardcore
2008-09-13 13:28 I'd like to have a build-debug environment
2008-09-13 13:29 good exercise
2008-09-13 13:29 hey, I'm not doing this to test a concept of a fs
2008-09-13 13:29 but to learn the API
2008-09-13 13:29 I know
2008-09-13 13:29 I'm excited about that
2008-09-13 13:29 you're probably going to be telling me about it in a week
2008-09-13 13:29 things I didn't know and should have ;)
2008-09-13 13:30 one would only hope...
2008-09-13 13:30 but right now, I'm not even getting it to compile a module ;-)
2008-09-13 13:30 I'm also expecting to hear some swearing in the channel
2008-09-13 13:30 the secret, the way everybody starts a new fs: cut and paste ramfs
2008-09-13 13:31 even lazier people cut and paste tux2
2008-09-13 13:31 sorry
2008-09-13 13:31 ext2
2008-09-13 13:31 I'm lazy - but I think that would be counter productive
2008-09-13 13:31 I'm starting with a clean slate, with ramfs/tmpfs/ext2 as cut-n-paste sources
2008-09-13 13:31 but planning on writing it all
2008-09-13 13:32 I want to understand every line of code
2008-09-13 13:32 and the only way to do that is to write it yourself...
2008-09-13 13:32 well - I've got a working makefile.
2008-09-13 13:32 of course it currently doesn't build any modules...
2008-09-13 13:32 ugh
2008-09-13 13:32 and use jon corbet's examples
2008-09-13 13:32 so maybe the definition of working is more like ' it doesn't report parse errors'
2008-09-13 13:32 there is a particularly good example from linux device drivers on building a minimal module
2008-09-13 13:35 http://lwn.net/Articles/21817/
2008-09-13 13:35 enjoy
2008-09-13 13:35 hmm
2008-09-13 13:35 this was around the time rusty fscked with the module system and messed it all up
2008-09-13 13:38 don't neglect to have a close look at my use_atom code I just posted to the list, the way it handles the positive and negative carries between shorts might be interesting to you
2008-09-13 13:38 a form of bit bashing you don't see much these days
2008-09-13 13:38 clumsy in c
2008-09-13 13:40 okay, have a junkfs.ko
2008-09-13 13:41 well it loads into running kernel (yay for testing on machine you're working on)
2008-09-13 13:41 and unloads
2008-09-13 13:41 of course all it has is empty init/exit
2008-09-13 13:42 [maze@nike junkfs]$ make clean
2008-09-13 13:42 rm -f *~ *.o *.ko *.mod.c .*.cmd
2008-09-13 13:42 rm -f modules.order .depend .version .*.o.flags .*.o.d
2008-09-13 13:42 rm -rf .tmp_versions
2008-09-13 13:42 rm -f Module.markers Module.symvers
2008-09-13 13:42 [maze@nike junkfs]$ make
2008-09-13 13:42 make -C /lib/modules/2.6.26.3-29.fc9.x86_64/build SUBDIRS=/home/maze/junkfs modules
2008-09-13 13:42 make[1]: Entering directory `/usr/src/kernels/2.6.26.3-29.fc9.x86_64'
2008-09-13 13:42 CC [M] /home/maze/junkfs/super.o
2008-09-13 13:42 LD [M] /home/maze/junkfs/junkfs.o
2008-09-13 13:42 Building modules, stage 2.
2008-09-13 13:42 MODPOST 1 modules
2008-09-13 13:42 CC /home/maze/junkfs/junkfs.mod.o
2008-09-13 13:42 LD [M] /home/maze/junkfs/junkfs.ko
2008-09-13 13:42 make[1]: Leaving directory `/usr/src/kernels/2.6.26.3-29.fc9.x86_64'
2008-09-13 13:42 (reverse-i-search)`modp': modprobe ath_pci
2008-09-13 13:42 [maze@nike junkfs]$ /sbin/lsmod | egrep ju
2008-09-13 13:42 [maze@nike junkfs]$ sudo /sbin/insmod ./junkfs.ko
2008-09-13 13:42 [maze@nike junkfs]$ /sbin/lsmod | egrep ju
2008-09-13 13:42 junkfs 9856 0
2008-09-13 13:42 [maze@nike junkfs]$ sudo /sbin/rmmod junkfs
2008-09-13 13:42 [maze@nike junkfs]$ /sbin/lsmod | egrep ju
2008-09-13 13:42 [maze@nike junkfs]$
2008-09-13 13:50 cat /proc/filesystems | grep junk
2008-09-13 14:02 ok, atom reverse mapping then we are done with atoms for a while
2008-09-13 14:04 ok, printk debugging v0.1 ready
2008-09-13 14:05 moving to v0.2
2008-09-13 14:05 @/home/maze/junkfs/super.c:26 - Entering: init_junk_fs()
2008-09-13 14:05 @/home/maze/junkfs/super.c:27 - Exiting: init_junk_fs()
2008-09-13 14:05 @/home/maze/junkfs/super.c:32 - Entering: exit_junk_fs()
2008-09-13 14:05 @/home/maze/junkfs/super.c:33 - Exiting: exit_junk_fs()
2008-09-13 14:05 registered it yet?
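Entry and exit traces like the ones above can be approximated in userspace. The following is only a sketch: the `TRACE_ENTER` and `TRACE_RETURN` names are hypothetical stand-ins (not the macros used in junkfs), and an in-memory buffer takes the place of printk so the output can be inspected. The return macro evaluates its expression into a temporary first, so the value is logged and returned without being computed twice:

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

char trace_log[4096];   /* stands in for the kernel log */

/* Append one formatted fragment to trace_log, truncating when full. */
void trace(const char *fmt, ...)
{
	size_t used = strlen(trace_log);
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(trace_log + used, sizeof trace_log - used, fmt, ap);
	va_end(ap);
}

/* Log "@file:line - Entering: func(args)"; args are a printf format list. */
#define TRACE_ENTER(...) do { \
	trace("@%s:%d - Entering: %s(", __FILE__, __LINE__, __func__); \
	trace(__VA_ARGS__); \
	trace(")\n"); \
} while (0)

/* Evaluate expr once into a temporary, log it, then return it. */
#define TRACE_RETURN(type, fmt, expr) do { \
	type ret_ = (expr); \
	trace("@%s:%d - Exiting: %s(...) = " fmt "\n", \
	      __FILE__, __LINE__, __func__, ret_); \
	return ret_; \
} while (0)

int add_traced(int a, int b)
{
	TRACE_ENTER("%d, %d", a, b);
	TRACE_RETURN(int, "%d", a + b);
}
```

Calling `add_traced(5, 6)` leaves one "Entering" line and one "Exiting" line in `trace_log`. Keeping the trace calls on their own lines, as argued later in the log, makes it easy to switch them off per function.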
2008-09-13 14:05 guess not
2008-09-13 14:05 or you would have grepped
2008-09-13 14:06 that would be v0.0.2 I think
2008-09-13 14:06 well
2008-09-13 14:06 that's just me ;)
2008-09-13 14:07 you're building in your home directory, most hacks build right in a kernel tree
2008-09-13 14:07 so you can git the whole tree
2008-09-13 14:08 I don't even have the source for the kernel ;-)
2008-09-13 14:08 it's just the normal fedora core 9 kernel from koji
2008-09-13 14:08 leet
2008-09-13 14:08 sploit time
2008-09-13 14:08 no, working on making it verbose
2008-09-13 14:08 that is in fact how I started tux2
2008-09-13 14:09 worked with modules up until I realized I was going to be bringing down my workstation a lot
2008-09-13 14:09 well, next step will be kvm
2008-09-13 14:09 you're moving along
2008-09-13 14:10 well, first still have to get debugging more verbose
2008-09-13 14:10 it's not dumping function entry or exit values
2008-09-13 14:10 ltt?
2008-09-13 14:10 and after all - the entire point of this exercise is to learn the api
2008-09-13 14:10 which means seeing what gets passed in
2008-09-13 14:10 (and out)
2008-09-13 14:11 plus it makes debugging easier
2008-09-13 14:17 ok, I have to "unbundle" the ext2 dirops so I can find out which block and offset it created a new dirent at
2008-09-13 14:17 easiest way is to return the dirent and buffer I guess
2008-09-13 14:17 and to be able to search a given dirent block
2008-09-13 14:17 probably how it should have been written in the first place
2008-09-13 14:25 Current debug output:
2008-09-13 14:25 @/home/maze/junkfs/super.c:44 - Entering: init_junk_fs()
2008-09-13 14:25 @/home/maze/junkfs/super.c:39 - Entering: test(5, 6)
2008-09-13 14:25 @/home/maze/junkfs/super.c:40 - Exiting: test(...) = 0
2008-09-13 14:25 @/home/maze/junkfs/super.c:46 - Exiting: init_junk_fs(...) = 0
2008-09-13 14:25 @/home/maze/junkfs/super.c:50 - Entering: exit_junk_fs()
2008-09-13 14:25 @/home/maze/junkfs/super.c:51 - Exiting: exit_junk_fs(...)
2008-09-13 14:26 here's the code:
2008-09-13 14:26 static int test (int a, int b) {
2008-09-13 14:26 <------>DBG_ENTER2(int,a,int,b);
2008-09-13 14:26 <------>DBG_RETURN1(int,0);
2008-09-13 14:26 }
2008-09-13 14:26 static int __init init_junk_fs(void) {
2008-09-13 14:26 <------>DBG_ENTER0();
2008-09-13 14:26 <------>test(5, 6);
2008-09-13 14:26 <------>DBG_RETURN1(int,0);
2008-09-13 14:26 }
2008-09-13 14:26 static void __exit exit_junk_fs(void) {
2008-09-13 14:26 <------>DBG_ENTER0();
2008-09-13 14:26 <------>DBG_RETURN0();
2008-09-13 14:26 }
2008-09-13 14:26 what are those funny minuses?
2008-09-13 14:26 tabs?
2008-09-13 14:26 oh, that's tabs
2008-09-13 14:27 dark blue on darker blue background, but they show up fine after pasting
2008-09-13 14:27 you need spaces after your commas or my head will explode getting yucky goo everywhere
2008-09-13 14:28 ok, you're ready to register/unregister your fs
2008-09-13 14:28 right, also need to force func declare and debug into the same line I think, and dedup it
2008-09-13 14:28 will total about 6 lines
2008-09-13 14:28 right.
2008-09-13 14:28 plus a couple for a stub fill_super
2008-09-13 14:28 well, and the filesystem_type decl ;-)
2008-09-13 14:28 it starts to bloat
2008-09-13 14:28 of course ;-)
2008-09-13 14:29 I'd go with the separate code and debug lines
2008-09-13 14:31 so, the following isn't better?
2008-09-13 14:31 DECLARE0(static,int,__init,init_junk_fs)
2008-09-13 14:31 <------>test(5, 6);
2008-09-13 14:31 <------>DBG_RETURN1(int, 0);
2008-09-13 14:31 }
2008-09-13 14:31 I'm not sure myself
2008-09-13 14:31 separate lines means it's easier to turn off on a per function basis
2008-09-13 14:32 makes my eyes bleed
2008-09-13 14:32 if not sure, go with the unbundled form
2008-09-13 14:32 oh right spaces ;-)
2008-09-13 14:32 right
2008-09-13 14:32 separate lines rules the world for debug traces
2008-09-13 14:33 got to remember, we're writing in C, don't try to make it pretty, you will not succeed, and if it doesn't look ugly then the fates will not smile upon you
2008-09-13 14:34 yeah, but for return unless I use a temp variable, I kind of have to return from within the macro
2008-09-13 14:34 This is what test looks like now:
2008-09-13 14:34 DECLARE2(static, int, , test, int, a, int, b)
2008-09-13 14:34 DBG_RETURN1(int, 0);
2008-09-13 14:34 }
2008-09-13 14:35 hmm, requires a little thinking, and I'm hungry
2008-09-13 14:35 try ltt
2008-09-13 14:35 it does this for you
2008-09-13 14:35 what's ltt?
2008-09-13 14:35 so you can concentrate on the problem
2008-09-13 14:35 linux trace toolkit
2008-09-13 14:36 yes, google found it
2008-09-13 14:37 ok, I have a better idea than shelling ext2_create_entry to return buffer and dirent... instead of an error code, return the dir file pos
2008-09-13 14:37 or -1 if there was an error
2008-09-13 14:37 the only error ext2_create_entry returns anyway is -EIO
2008-09-13 14:37 just more braindamaged C style error handling, or lack of it
2008-09-13 14:37 I'm not making it worse, honest
2008-09-13 14:46 oh, right this is C
2008-09-13 14:46 can't declare vars in the middle of func body
2008-09-13 14:46 you can
2008-09-13 14:46 well
2008-09-13 14:47 you have to override the kernel compile flags
2008-09-13 14:47 so you don't get warnings
2008-09-13 14:47 we might build tux3 that way for a while
2008-09-13 14:47 folks
2008-09-13 14:48 until the squawking from old schoolers gets too much to bear
2008-09-13 14:48 I can't remember what the reason for not using C++ was in the kernel?
2008-09-13 14:48 was it the programmers?
2008-09-13 14:49 or were there actual issues with the compiler
2008-09-13 14:49 fear of exceptions and crazy hidden semantics
2008-09-13 14:49 no real issues
2008-09-13 14:49 I know you _can_ compile non-C++-std-compliant C++ without any libraries
2008-09-13 14:49 on, no designated initializers
2008-09-13 14:49 that's a killer
2008-09-13 14:50 even c99 is permabanned
2008-09-13 14:50 I probably know the feature, but I'm not sure what that refers to, is that the .something = something struct initializer
2008-09-13 14:50 for no reason whatsoever
2008-09-13 14:50 or the [d] = something
2008-09-13 14:50 right
2008-09-13 14:50 essential
2008-09-13 14:50 both?
2008-09-13 14:50 agreed, they're essential
2008-09-13 14:50 the former mostly
2008-09-13 14:50 I have used the second exactly once, and that was last week
2008-09-13 14:50 in tux3
2008-09-13 14:51 fear of exceptions is a good one, but you should just not use them ;-)
2008-09-13 14:51 it's mostly fear of hidden behavior
2008-09-13 14:51 linus hates that
2008-09-13 14:51 I've never seen a c++ prog that didn't have it
2008-09-13 14:51 hidden behaviour... is the C++ compiler more loose?
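The two initializer forms mentioned above, `.something = something` and `[d] = something`, can be shown side by side. A small sketch; the struct and the table are invented for illustration and are not kernel or tux3 definitions:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an operations table such as a filesystem type. */
struct fs_type_sketch {
	const char *name;
	int fs_flags;
	int (*init)(void);
};

int junk_init(void)
{
	return 0;
}

/* Member designators: fields are named, anything left out is zeroed. */
struct fs_type_sketch junk_type = {
	.name = "junkfs",
	.init = junk_init,
};

/* Index designators: a sparse table keyed by errno value. */
const char *const errno_name[] = {
	[EIO] = "EIO",
	[EINVAL] = "EINVAL",
};
```

Both forms are part of C99, and C++ did not offer them at the time of this log. Fields and slots without a designator start out as zero or NULL, which is why `junk_type.fs_flags` is 0 and `errno_name[0]` is NULL here.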
2008-09-13 14:51 way loose
2008-09-13 14:52 code generated is beyond pathetic compared to hand crafted C
2008-09-13 14:52 you can also write hand crafted c++ of course but nobody does
2008-09-13 14:52 even if you don't use code-killing features?
2008-09-13 14:52 (like exceptions, multiple inheritance, large parts of OO, etc)
2008-09-13 14:52 linuxers don't have that discipline
2008-09-13 14:52 [templates...]
2008-09-13 14:53 remember, 90%+ of linux is dodgy drivers written by people who wish it was saturday
2008-09-13 14:53 ah, so it really boils down to programmers
2008-09-13 14:53 I wish it was saturday
2008-09-13 14:53 (and it is!)
2008-09-13 14:53 me too
2008-09-13 14:53 right
2008-09-13 14:53 it's nice to be happy :)
2008-09-13 14:53 speaking of which
2008-09-13 14:53 nice when wishes come true
2008-09-13 14:53 nearly sk8 oclock
2008-09-13 14:53 and I'm nearly done with the atom revmap
2008-09-13 14:53 woohoo
2008-09-13 14:54 I need to include linux/fs.h apparently ;-)
2008-09-13 14:56 the fun starts
2008-09-13 14:58 well registering was simple
2008-09-13 14:59 of course there's no get_sb function declared...
2008-09-13 14:59 naturally 2008-09-13 14:59 now you can see it in proc 2008-09-13 14:59 a lot of stuff is happening 2008-09-13 15:00 that's where you realize the vfs is actually oo, even if it was developed by folks with very little understanding of oo 2008-09-13 15:00 which includes me ;-) 2008-09-13 15:00 though to be sure my role in vfs devel was minor 2008-09-13 15:01 mainly just contributed the inode specialization model 2008-09-13 15:02 ok, atom reverse entries are being created 2008-09-13 15:02 now let's reverse an atom 2008-09-13 15:02 I suppose I could use the readdir interface for this 2008-09-13 15:03 that would be kind of perverse 2008-09-13 15:03 no, sorry, really perverse 2008-09-13 15:03 nodev junkfs 2008-09-13 15:03 I just won't do that 2008-09-13 15:03 good 2008-09-13 15:03 now to take a look at the flags 2008-09-13 15:04 the difference between a nodev and a dev is significant ;-) 2008-09-13 15:04 have fun will kill_litter_super 2008-09-13 15:04 with 2008-09-13 15:04 well, right now none are declared, although it is easy to make it dev 2008-09-13 15:04 I'd like to see what is available 2008-09-13 15:04 kinda easy 2008-09-13 15:04 and kinda not 2008-09-13 15:04 you will see 2008-09-13 15:04 it's not as crystalline as you think right now 2008-09-13 15:06 /* public flags for file_system_type */
#define FS_REQUIRES_DEV 1
#define FS_BINARY_MOUNTDATA 2
#define FS_HAS_SUBTYPE 4
#define FS_REVAL_DOT 16384 /* Check the paths ".", ".." for staleness */
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
2008-09-13 15:06 so requires dev, means a block device with the data, as opposed to just in mem 2008-09-13 15:06 I wonder if nfs needs dev 2008-09-13 15:06 probably not 2008-09-13 15:07 right 2008-09-13 15:07 nfs is nodev 2008-09-13 15:07 binary mount data is probably for nfs and smb because they use binary mount options and hence have special mount programs 2008-09-13 15:07 subtype - no idea 2008-09-13 15:07 good observation 2008-09-13 15:07 dev means "block dev" 2008-09-13 15:07 maybe something like fat 2008-09-13 15:07 is considered subtyped 2008-09-13 15:08 no idea either 2008-09-13 15:08 sounds like rot 2008-09-13 15:08 reval_dot seems especially useful for nfs, maybe others 2008-09-13 15:08 d_move - seems like something worth knowing 2008-09-13 15:08 although probably later on 2008-09-13 15:08 see, I never looked at all those flags 2008-09-13 15:08 worthwhile knowing there's an implementation option there 2008-09-13 15:08 they come and go 2008-09-13 15:08 not a stable api 2008-09-13 15:08 well, you have to develop for some api ;-) 2008-09-13 15:09 right 2008-09-13 15:09 internal kernel api is a moving target 2008-09-13 15:09 of course 2008-09-13 15:09 partly intentional to encourage out of tree people to merge 2008-09-13 15:09 partly to improve it 2008-09-13 15:10 probably worthwhile, although breakage for breakage's sake should be frowned upon, if it's just okay to change stuff, but only for the 'better', then that's another issue 2008-09-13 15:11 ok, let's see which fs'es use which flags 2008-09-13 15:11 yeah, I'm needlessly thorough...
but oh, well, can't change who and what I am 2008-09-13 15:11 that means I'll be able to ask you questions soon 2008-09-13 15:12 it's a feature, not a bug 2008-09-13 15:13 blockdev - tons, as expected - including nfsd (the server) 2008-09-13 15:13 although for nfsd it's actually checking you're exporting a fs with a dev backing 2008-09-13 15:13 wonder if that means you can't export ramfs 2008-09-13 15:14 oh because nfs uses the dev from cookie to lookup the fs 2008-09-13 15:15 ugh, broken 2008-09-13 15:15 [by design] 2008-09-13 15:16 although there's a hack to be able to re-export nfs mounts 2008-09-13 15:16 binary_mountdata 2008-09-13 15:16 we're going to have to copy this buffer and make it tux3 U #3 2008-09-13 15:17 coda, ncpfs (netware), nfs, smbfs/cifs 2008-09-13 15:17 so basically the complex net file systems 2008-09-13 15:17 all nodev? 2008-09-13 15:17 of course 2008-09-13 15:17 probably because they take so many options related to networking 2008-09-13 15:17 -> no binary_mountdata 2008-09-13 15:17 never looked at that 2008-09-13 15:18 the opposite of that is? 
2008-09-13 15:18 subtype appears to be a fuse hack 2008-09-13 15:18 ah, right 2008-09-13 15:18 the opposite of binary_mountdata is not putting it in flags 2008-09-13 15:18 see my complaint about fuse on that topic, thursday 2008-09-13 15:18 all 'normal' filesystems use text string mount options 2008-09-13 15:18 wrong idea 2008-09-13 15:18 each fuse fs should get its own type 2008-09-13 15:18 not all of the "fuse" 2008-09-13 15:18 just wrong 2008-09-13 15:19 okay, skipping fuse parsing ;-) giving me a headache 2008-09-13 15:20 REVAL_DOT 2008-09-13 15:20 is nfs only 2008-09-13 15:20 related to parent directory entries of a path being able to go stale 2008-09-13 15:20 revalidate 2008-09-13 15:20 something which is related to nfs protocol borkenness 2008-09-13 15:20 yes 2008-09-13 15:20 subtle 2008-09-13 15:20 although maybe hard to fix in a new netfs 2008-09-13 15:20 no, not hard 2008-09-13 15:21 just has to be stateful 2008-09-13 15:21 right 2008-09-13 15:21 stateless is unworkable braindamage 2008-09-13 15:21 but you want it both stateful, and stateless 2008-09-13 15:21 a pox upon us 2008-09-13 15:21 I don't agree 2008-09-13 15:21 lightweight state 2008-09-13 15:21 that scales 2008-09-13 15:21 is good 2008-09-13 15:21 nfs is bad 2008-09-13 15:21 doesn't work properly 2008-09-13 15:21 you want lightweight statefull with fallback to stateless 2008-09-13 15:21 trond will disagree of course 2008-09-13 15:21 the fallback being mostly for the server reboot/failover case 2008-09-13 15:22 you never want stateless 2008-09-13 15:22 stateless == brainless 2008-09-13 15:22 :-) 2008-09-13 15:22 well, would have to think about it more... stateless has nice features that you do want 2008-09-13 15:22 a nematode is getting close to stateless 2008-09-13 15:22 metanode? 2008-09-13 15:23 heh 2008-09-13 15:23 is that an anagram? 2008-09-13 15:23 wow 2008-09-13 15:23 so which is it? 
2008-09-13 15:23 don't start with puns now ;-) 2008-09-13 15:23 nematode: disgusting little worm 2008-09-13 15:23 right, but is there an fs concept called nematode? 2008-09-13 15:23 nfs: disgusting little hack that grew up into a huge disgusting little worm 2008-09-13 15:24 no 2008-09-13 15:24 just me dissing nfs 2008-09-13 15:24 wish trond were here ;-) 2008-09-13 15:24 so you just mistyped metanode? or you meant the worm 2008-09-13 15:24 seem - I'm clueless 2008-09-13 15:24 sarcasm/irony/human interaction just fly right over me 2008-09-13 15:24 no, I meant to type nematode, I was comparing nfs to a nematode 2008-09-13 15:24 s/seem/see/ 2008-09-13 15:25 both are nearly stateless 2008-09-13 15:25 wait a minute - nfs is stateless... sin't it? 2008-09-13 15:25 not quite 2008-09-13 15:25 lockd implements a stateful protocol 2008-09-13 15:25 it's fakery to pretend it doesn't 2008-09-13 15:25 right - those are extensions 2008-09-13 15:25 although 2008-09-13 15:26 to be fair running with out it doesn't happen 2008-09-13 15:26 also tcp 2008-09-13 15:26 can't really be separated 2008-09-13 15:26 not really 2008-09-13 15:30 nfs is actually like 4 fs'es 2008-09-13 15:30 2 being v3 vs v4 2008-09-13 15:30 and 2 being normal vs cross-device registration hackery 2008-09-13 15:30 so you have 2 * 2 = 4 2008-09-13 15:30 right 2008-09-13 15:30 anyway 2008-09-13 15:30 D_MOVE 2008-09-13 15:30 it's rather cleverly and lazily compressed into fairly small source 2008-09-13 15:30 in linux 2008-09-13 15:30 apparently used by 2008-09-13 15:31 nfs and ocfs2 2008-09-13 15:31 probably related to directory deletions in some way 2008-09-13 15:31 what does it do? 
2008-09-13 15:31 so much hackery in linux is because of nfs 2008-09-13 15:31 we'd be way better off if it had never been written 2008-09-13 15:32 well, some people make a living from it 2008-09-13 15:32 so they are ok 2008-09-13 15:32 and they are generally good to drink with 2008-09-13 15:32 especially good to drink with 2008-09-13 15:32 I think there must be a connection 2008-09-13 15:32 nope renames 2008-09-13 15:33 right, dentry move 2008-09-13 15:33 actually I dimly recall that 2008-09-13 15:33 so basically this is something along the lines of support for atomic renames 2008-09-13 15:33 a big wart in dentry cache 2008-09-13 15:33 and somehow nfs and ocfs2 are special 2008-09-13 15:33 I wonder why ocfs2 needs it 2008-09-13 15:34 can ask mark fasheh about that 2008-09-13 15:34 FS will handle d_move() during rename() internally. 2008-09-13 15:34 from the header file for that #define 2008-09-13 15:34 okay, looks like at first glance (as expected) we just need a backing blockdev 2008-09-13 15:34 I strongly suspect it was to solve a locking bottleneck in ocfs2 2008-09-13 15:35 but worth knowing for future design that there are such hacks 2008-09-13 15:35 for rename and for stale . and .. 2008-09-13 15:36 yes 2008-09-13 15:36 probably want to avoid it, but... 2008-09-13 15:36 what's ocfs2? 2008-09-13 15:36 nice little cluster filesystem from oracle 2008-09-13 15:36 quite underrated 2008-09-13 15:37 question about filenames 2008-09-13 15:37 does the vfs layer enforce, no nulls and no slashes in a file name? 2008-09-13 15:37 but otherwise anything goes?
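The flag survey above treats the `file_system_type` flags as a plain bitmask. A minimal userspace sketch of decoding such a mask follows, assuming only the five values from the header excerpt pasted in the log. The helper `describe_fs_flags` is hypothetical and is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values copied from the header excerpt quoted in the log. */
#define FS_REQUIRES_DEV         1
#define FS_BINARY_MOUNTDATA     2
#define FS_HAS_SUBTYPE          4
#define FS_REVAL_DOT            16384
#define FS_RENAME_DOES_D_MOVE   32768

/* Hypothetical helper: write the names of all set flags into buf,
 * separated by '|'. A mask without FS_REQUIRES_DEV starts with "nodev",
 * mirroring the "nodev junkfs" line seen in proc. Returns buf. */
char *describe_fs_flags(int flags, char *buf, size_t len)
{
	static const struct {
		int bit;
		const char *name;
	} table[] = {
		{ FS_REQUIRES_DEV, "FS_REQUIRES_DEV" },
		{ FS_BINARY_MOUNTDATA, "FS_BINARY_MOUNTDATA" },
		{ FS_HAS_SUBTYPE, "FS_HAS_SUBTYPE" },
		{ FS_REVAL_DOT, "FS_REVAL_DOT" },
		{ FS_RENAME_DOES_D_MOVE, "FS_RENAME_DOES_D_MOVE" },
	};

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (!(flags & FS_REQUIRES_DEV))
		strncat(buf, "nodev", len - 1);
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!(flags & table[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, table[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

For a filesystem that only needs a backing block device, which is the conclusion reached above, the mask is just `FS_REQUIRES_DEV`.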
2008-09-13 15:38 okay nodev is out 2008-09-13 15:39 next step - what the hell is get_sb ;-) 2008-09-13 15:39 main: >>> found unatom entry 12 for atom 1 2008-09-13 15:40 now to print out the name 2008-09-13 15:41 main: found unatom entry 0 for atom 0 2008-09-13 15:41 main: found unatom entry 12 for atom 1 2008-09-13 15:41 main: found unatom entry 24 for atom 2 2008-09-13 15:41 main: found unatom entry 0 for atom 3 2008-09-13 15:41 main: found unatom entry 0 for atom 4 2008-09-13 15:41 etc 2008-09-13 15:42 well, it should not be 0 for unknown atoms 2008-09-13 15:42 probably 2008-09-13 15:42 oh 2008-09-13 15:42 sure it should 2008-09-13 15:42 unused entry in the unatom table 2008-09-13 15:46 unused entry in the unatom table 2008-09-13 15:46 whoops 2008-09-13 15:47 main: found unatom entry 0 for atom 0 2008-09-13 15:47 0xb7d10400: 00 00 00 00 0c 00 03 00 66 6f 6f dd 01 00 00 00 "........foo....." 2008-09-13 15:47 there we go 2008-09-13 15:47 reversed 2008-09-13 15:47 time to skate 2008-09-13 16:01 -!- caoliver(~oliver@75-134-208-20.dhcp.trcy.mi.charter.com) has joined #tux3 2008-09-13 16:02 6,800 lines 2008-09-13 16:02 only added about 300 including xattr support and atom refcounting 2008-09-13 16:02 xattrs will come in at around 500 lines total and be perfectly usuable 2008-09-13 16:02 superior maybe 2008-09-13 16:03 sk8 oclock 2008-09-13 16:03 really 2008-09-13 16:09 made further improvements to debugging: 2008-09-13 16:09 @/home/maze/junkfs/super.c:41 - Entering: init_junk_fs() 2008-09-13 16:09 @/home/maze/junkfs/super.c:26 - Entering: test(a=(int)5, b=(int)6) 2008-09-13 16:09 @/home/maze/junkfs/super.c:27 - Returning: test(...) = a + b = (int)11 2008-09-13 16:09 @/home/maze/junkfs/super.c:46 - Mark in init_junk_fs(...) err=(int)0 2008-09-13 16:09 @/home/maze/junkfs/super.c:49 - Returning: init_junk_fs(...) 
= 0 = (int)0 2008-09-13 16:09 @/home/maze/junkfs/super.c:55 - Entering: exit_junk_fs() 2008-09-13 16:10 @/home/maze/junkfs/super.c:57 - Returning: exit_junk_fs(...) = void 2008-09-13 16:10 I have to admit, your plan to write a fs to learn vfs is working out well 2008-09-13 16:10 spaces around the = please ;-) 2008-09-13 16:11 hmm, but those are like colons or something 2008-09-13 16:11 then colon with one space after 2008-09-13 16:11 lindent 2008-09-13 16:11 might as well get used to it 2008-09-13 16:11 well, this is text output from dmesg 2008-09-13 16:12 but yeah ': ' is probably better 2008-09-13 16:12 still 2008-09-13 16:12 it may well escape to lkml one day 2008-09-13 16:12 who knows 2008-09-13 16:12 @/home/maze/junkfs/super.c:26 - Entering: test(a: (int)5, b: (int)6) 2008-09-13 16:12 @/home/maze/junkfs/super.c:27 - Returning: test(...) = a + b = (int)11 2008-09-13 16:12 @/home/maze/junkfs/super.c:46 - Mark in init_junk_fs(...) err: (int)0 2008-09-13 16:13 does look better 2008-09-13 16:13 printk bytes are cheap ;-) 2008-09-13 16:13 yes 2008-09-13 16:13 easier on my eyes 2008-09-13 16:13 pleasant even 2008-09-13 16:13 changed already 2008-09-13 16:13 I noticed 2008-09-13 16:13 it was after all a 2 byte change 2008-09-13 16:13 changed, tested and pasted into the cloud 2008-09-13 16:13 that's the spirit 2008-09-13 16:13 I should probably setup a repository for this junkfs 2008-09-13 16:14 and work on getting a kvm debug working 2008-09-13 16:14 probably don't want to muck around with the get_sb stuff on my live box 2008-09-13 16:14 also have to figure out how to get compile junk to go elsewhere 2008-09-13 16:14 than the source dir 2008-09-13 16:16 yes 2008-09-13 16:17 cd your/source 2008-09-13 16:17 hg init 2008-09-13 16:17 hg add . 2008-09-13 16:17 hg commit 2008-09-13 16:17 that's all there is to it 2008-09-13 16:17 hg is mercurial? 
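The enter/return trace lines pasted above could come from macros along these lines. This is a guess at the shape, written for userspace with `fprintf` standing in for `printk`. `TRACE_ENTER`, `TRACE_RETURN_INT` and `add_traced` are hypothetical names, and the real junkfs macros (which also print the arguments) are not shown in the log.

```c
#include <stdio.h>

/* Print "@file:line - Entering: func()" for the calling function. */
#define TRACE_ENTER() \
	fprintf(stderr, "@%s:%d - Entering: %s()\n", \
		__FILE__, __LINE__, __func__)

/* Evaluate expr once, print its source text and value, then return it. */
#define TRACE_RETURN_INT(expr) \
	do { \
		int ret_ = (expr); \
		fprintf(stderr, "@%s:%d - Returning: %s(...) = %s = (int)%d\n", \
			__FILE__, __LINE__, __func__, #expr, ret_); \
		return ret_; \
	} while (0)

int add_traced(int a, int b)
{
	TRACE_ENTER();
	TRACE_RETURN_INT(a + b);
}
```

The `#expr` stringification is what would let the trace show both the expression text (`a + b`) and its value.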
2008-09-13 16:17 yes 2008-09-13 16:18 probably need to install it first then 2008-09-13 16:19 I'm really pleased with the xattr atom stuff 2008-09-13 16:19 awesome 2008-09-13 16:19 need to use some slight imagination to see how it will perform with a little cache in front of it, and to see the impact of atomic update/log rollup 2008-09-13 16:19 but otherwise I guess it's done 2008-09-13 16:19 some fiddling 2008-09-13 16:20 no more questions about potential lurking complexity 2008-09-13 16:20 and whether it can emulate straight ascii strings 2008-09-13 16:20 I don't think we need a option, really 2008-09-13 16:21 cool 2008-09-13 16:21 one thing missing: find a free atom 2008-09-13 16:21 to use 2008-09-13 16:21 instead of bindly generating new ones, need to code that 2008-09-13 16:22 the plan is to just let the thing expand up to some size, count the deletions in it, then when deletions/size exceeds a threshold, we rescan for deleted entries 2008-09-13 16:22 deleted atoms 2008-09-13 16:22 probably overkill 2008-09-13 16:22 an alternative is to put a linked list of free atoms in the unatom table 2008-09-13 16:22 better 2008-09-13 16:23 oh shiny - finaly rhel4.7 is out in centos 4.7 2008-09-13 16:23 yup free atom list is always better 2008-09-13 16:24 it shall be so 2008-09-13 16:24 will code that when I get back 2008-09-13 16:24 also need to code the atom table dump 2008-09-13 16:25 so some more fiddling until I can escape to more interesting things 2008-09-13 16:32 -!- caoliver(~oliver@75-134-208-20.dhcp.trcy.mi.charter.com) has left #tux3 2008-09-13 17:27 one slight drawback to my atable design I just noticed 2008-09-13 17:27 putting the tables up so high will make the radix tree quite deep 2008-09-13 17:27 I think 2008-09-13 17:27 so when I map at block 2^28 2008-09-13 17:28 radix tree has 2^5 fanout 2008-09-13 17:28 that is 6 radix tree levels 2008-09-13 17:28 probably nothing to worry about 2008-09-13 17:28 we zip through those very fast 2008-09-13 17:28 and with 
the hash in front of it, the overhead will disappear in the noise, if it was not already 2008-09-13 17:29 against that, we have the pleasing property of only having to sync one file to sync the entire atable including recounts and reverse map 2008-09-13 20:25 -!- tim_dimm(~mobile@32.156.233.244) has joined #tux3 2008-09-13 20:27 -!- tim_dimm(~mobile@32.156.233.244) has joined #tux3 2008-09-13 20:28 Howdy 2008-09-13 20:29 Got a geeky irc app for my phone 2008-09-13 20:29 :-) 2008-09-13 20:32 -!- tim_dimm(~mobile@32.156.233.244) has joined #tux3 2008-09-13 22:15 -!- MaZe(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #tux3 2008-09-13 22:15 wb maze 2008-09-13 22:15 hey 2008-09-13 22:15 show_freeatoms: next = dead000000000666 2008-09-13 22:16 linked list 2008-09-13 22:16 nontrivial proposition when it's linked through disk blocks 2008-09-13 22:16 see the cute magic number 2008-09-13 22:16 for deleted atom 2008-09-13 22:16 I've a question about scheduling priorities... [yes, cute indeed] 2008-09-13 22:17 I'm not much of a scheduler person but fire away 2008-09-13 22:17 so, if we have a pre-emptible kernel, and one low-priority (ie. niced) user process does something which results in a call to the fs code, what priority does that code run within the kernel? 2008-09-13 22:18 same 2008-09-13 22:18 niced 2008-09-13 22:18 will it also be effectively scheduled as a niced task? giving way to other threads of execution of higher priority? 2008-09-13 22:18 yes 2008-09-13 22:18 so does linux then automatically boost thread execution priority within the kernel if a low-prio thread is blocking a higher-prio thread by holding a lock? 2008-09-13 22:19 [and, can thread priority be manually temporarily increase/decreased/changed within kernel code - for whatever reason] 2008-09-13 22:20 there's some priority inheritance stuff, yes, but I'm not familiar with it 2008-09-13 22:20 ie. 
how does the linux kernel deal with std priority inversion jazz 2008-09-13 22:20 you can do whatever you want in kernel 2008-09-13 22:20 ahh 2008-09-13 22:20 including changing priority 2008-09-13 22:20 of your task or any other 2008-09-13 22:21 you can also fill the entire kernel with zero ;-) 2008-09-13 22:21 true - good point 2008-09-13 22:21 although I was trying to read the bootid uuid from my module, and that actually turns out to be very non-trivial 2008-09-13 22:21 didn't say anything was easy 2008-09-13 22:21 almost nothing is 2008-09-13 22:21 but you can do it 2008-09-13 22:21 since it's not exported, and I don't see a good way to grab sysctl's from within the kernel ;-) and the interfaces are always user-oriented 2008-09-13 22:22 [obviously here easiest solution is to fix random.c to export the boot_id... but that's not something that can be done in a module] 2008-09-13 22:22 lyou'll get frustrated about what is not exported, until you realize... just export it 2008-09-13 22:22 right 2008-09-13 22:22 if it's a stupid idea you'll find out soon enough 2008-09-13 22:22 it just won't compile on 'older' kernels then 2008-09-13 22:23 or get past linus usually 2008-09-13 22:23 following linked lists is always scary 2008-09-13 22:23 never feels like it's going to terminate 2008-09-13 22:23 it did: 2008-09-13 22:23 show_freeatoms: next = dead000000000666 2008-09-13 22:23 show_freeatoms: next = dead000000000000 2008-09-13 22:24 this time 2008-09-13 22:24 well there's basically a static char[16] with the bootid, it's got links to it, but... 
ugh 2008-09-13 22:24 notice the frist atom to die was number 0 2008-09-13 22:24 so 0 is a valid atom 2008-09-13 22:24 probably going to regret that 2008-09-13 22:24 yup ;-) 2008-09-13 22:24 I always make 0 invalid 2008-09-13 22:24 or free 2008-09-13 22:24 or something 2008-09-13 22:24 well 2008-09-13 22:25 don't make it invalid just because you're lame ;) 2008-09-13 22:25 make it invalid because you have a good reason 2008-09-13 22:25 I don't have a good reason for atom zero yet 2008-09-13 22:25 but there likely is one 2008-09-13 22:25 good reason: it's easier on the eyes when you later debug it 2008-09-13 22:25 you never expect a 0 value to actually be pointing/referencing to something 2008-09-13 22:25 the magic number there makes things pretty unambiguous 2008-09-13 22:26 actually with the magic number - sure 2008-09-13 22:26 depends 2008-09-13 22:26 it's without that I'd be worried 2008-09-13 22:26 zero is often a valid offset 2008-09-13 22:26 ie. before it's dead 2008-09-13 22:26 it is a valid dirent offset in ext2 for example 2008-09-13 22:26 yes, but offset is a seperate matter 2008-09-13 22:26 well 2008-09-13 22:26 they're all offsets 2008-09-13 22:26 ok, right it needs some deeper thinko 2008-09-13 22:26 there is no such thing as an absolute address any more ;) 2008-09-13 22:27 -!- tim_dimm(~timothyhu@adsl-67-114-40-138.dsl.scrm01.pacbell.net) has joined #tux3 2008-09-13 22:27 hi daddy_dimm 2008-09-13 22:27 chatting on your iphone? 
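The free-atom list discussed above can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the tux3 code: the real list is linked through file blocks, and the entry encoding here (a `0xdead` tag in the top 16 bits, the next free atom in the low 48 bits, and `LIST_END` as terminator) is a guess modelled on the `show_freeatoms` output. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NATOMS     8                       /* tiny table, for illustration */
#define FREE_TAG   0xdeadULL               /* tag in the top 16 bits */
#define TAG_SHIFT  48
#define NEXT_MASK  ((1ULL << TAG_SHIFT) - 1)
#define LIST_END   NEXT_MASK               /* hypothetical end-of-list marker */

struct atable {
	uint64_t unatom[NATOMS];  /* reverse entry per atom, or a tagged free link */
	uint64_t used;            /* atoms created so far by extending the table */
	uint64_t freehead;        /* first free atom, or LIST_END */
};

void atable_init(struct atable *t)
{
	memset(t, 0, sizeof(*t));
	t->freehead = LIST_END;
}

int entry_is_free(uint64_t entry)
{
	return (entry >> TAG_SHIFT) == FREE_TAG;
}

/* Push the atom on the free list: its slot now holds tag + old head. */
void free_atom(struct atable *t, uint64_t atom)
{
	assert(atom < t->used && !entry_is_free(t->unatom[atom]));
	t->unatom[atom] = (FREE_TAG << TAG_SHIFT) | t->freehead;
	t->freehead = atom;
}

/* Reuse a freed atom if there is one, otherwise extend the table. */
uint64_t alloc_atom(struct atable *t, uint64_t entry)
{
	uint64_t atom;

	assert(!entry_is_free(entry));
	if (t->freehead != LIST_END) {
		atom = t->freehead;
		t->freehead = t->unatom[atom] & NEXT_MASK;
	} else {
		assert(t->used < NATOMS);
		atom = t->used++;
	}
	t->unatom[atom] = entry;
	return atom;
}
```

Because a freed slot carries the tag, atom 0 can stay a valid atom number and a walk over the table can still tell free slots from live ones, which is the point made above about the magic number.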
2008-09-13 22:27 yup 2008-09-13 22:27 leet 2008-09-13 22:27 ulberleet 2008-09-13 22:27 uber even 2008-09-13 22:27 makin' sure it came through 2008-09-13 22:27 it did 2008-09-13 22:27 k, pissin off the wife now 2008-09-13 22:27 then you had to change a diaper or something 2008-09-13 22:27 I should go 2008-09-13 22:27 heh 2008-09-13 22:28 short leash 2008-09-13 22:28 it was ever thus 2008-09-13 22:28 it was your idea ;) 2008-09-13 22:28 got 2500 to change minumim if I do my part 2008-09-13 22:28 k 2008-09-13 22:28 yammer at ya later 2008-09-13 22:28 later 2008-09-13 22:28 toodles 2008-09-13 22:29 what's up with zumastor? 2008-09-13 22:29 ok, now I just need to allocate from that list, then I'm done for the night 2008-09-13 22:29 again, nontrivial 2008-09-13 22:29 when the list is linked through file blocks 2008-09-13 22:29 entirely different scale of hacking than in memory 2008-09-13 22:31 I see I have some buffer leaks to chase 2008-09-13 22:31 so... it's going to be a while 2008-09-13 22:31 before I can rest 2008-09-13 22:35 -!- stargazr5(~gauravstt@59.95.17.142) has joined #tux3 2008-09-13 22:38 http://lxr.linux.no/linux+v2.6.26.5/drivers/md/md.c#L595 2008-09-13 22:38 what the hell is that code doing - and why? 2008-09-13 22:38 isn't that spurious? 2008-09-13 22:38 never mind 2008-09-13 22:38 dealing with carry 2008-09-13 22:39 two iterations, yes 2008-09-13 22:39 still looks crappy 2008-09-13 22:41 it's the sort off stuff that is cleaner in assembler 2008-09-13 22:42 adc %ah,%al; adc 0,%al - or whatever the proper registers are called nowadays 2008-09-13 22:42 uhm, first one add, second adc 2008-09-13 22:43 and that's merely 16 bit -> 8 bit, not 32->16 2008-09-13 22:43 much cleaner 2008-09-13 22:46 okay, trying to read in a superblock now, using bios 2008-09-13 22:56 brave 2008-09-13 22:57 submit_bio()... then what? 
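The "dealing with carry ... two iterations" remark above describes the usual end-around-carry fold. A generic sketch of folding a 32-bit sum into 16 bits follows. It is not a copy of the md.c code linked above, and `fold32to16` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit sum into 16 bits, adding the high half back into the low. */
uint16_t fold32to16(uint32_t sum)
{
	/* First pass: at most 0xffff + 0xffff = 0x1fffe, so bit 16 may be set. */
	sum = (sum & 0xffff) + (sum >> 16);
	/* Second pass: absorbs that one carry; the result now fits in 16 bits. */
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

void fold_demo(void)
{
	assert(fold32to16(0x12345678) == 0x68ac);  /* no carry out of first pass */
	assert(fold32to16(0x0001ffff) == 0x0001);  /* second pass is needed here */
	assert(fold32to16(0xffffffff) == 0xffff);
}
```

The second pass is what the add-then-adc pair does for free in assembler, since the carry flag feeds straight back into the sum.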
2008-09-13 23:00 glacing through other fs'es 2008-09-13 23:00 and other kernel subsystem 2008-09-13 23:00 the submit code in swap.c seems promising 2008-09-13 23:00 also check block_read_full_page 2008-09-13 23:00 and friends 2008-09-13 23:01 you need to set up an endio that unlocks something 2008-09-13 23:01 wakes up your process typically 2008-09-13 23:01 and you have to remember what you are supposed to wake up somehow 2008-09-13 23:01 in the private field of the bio 2008-09-13 23:01 you will typically stick some state struct in there 2008-09-13 23:01 this is working on the metal 2008-09-13 23:02 :-) cool. 2008-09-13 23:02 you could try the prepare_to_sleep etc api here 2008-09-13 23:02 submit, then sleep 2008-09-13 23:03 see, that's why people tend to use submit_bh, because then you can do wait_on_buffer 2008-09-13 23:03 but it's a very crufty path 2008-09-13 23:08 http://lxr.linux.no/linux+v2.6.26.5/mm/page_io.c#L25 2008-09-13 23:08 that looks like it might be a decent example of asynch bio handling 2008-09-13 23:09 pretty good 2008-09-13 23:10 end page writeback will do a bunch of stuff you don't need 2008-09-13 23:11 but writebackk will only happen if I dirty the page right? 2008-09-13 23:11 actually, scratch that 2008-09-13 23:11 I'm still not quite sure, whether this interface is read/write or mmap or both 2008-09-13 23:11 you're in complete control 2008-09-13 23:12 when you go submit_bio, stuff starts to happen 2008-09-13 23:12 but you will not be able to use these functions directly 2008-09-13 23:12 just use as a guid to write your own 2008-09-13 23:12 right 2008-09-13 23:12 somebody ought to make a simple "read that into this page" 2008-09-13 23:12 can pages be shared between kernel and userspace? 2008-09-13 23:12 based on this 2008-09-13 23:12 but nobody has that I know 2008-09-13 23:13 only by mapping into a page table 2008-09-13 23:13 ie. both the kernel and userspace have a part of disk mmap'ed into phys memory? 
2008-09-13 23:13 and whichever edits, the other sees? 2008-09-13 23:13 yes 2008-09-13 23:13 by doing the mapping through the pagetable? 2008-09-13 23:13 done all the time 2008-09-13 23:13 yes 2008-09-13 23:14 I'm taking baby steps here ;-) 2008-09-13 23:14 you have to ask the right way, get it setup correctly 2008-09-13 23:14 so that it will be recovered properly when your process exits 2008-09-13 23:14 and so on 2008-09-13 23:15 it's a big topic 2008-09-13 23:15 we're going to do "read" on tuesday 2008-09-13 23:16 that in itself is a big topic