[Etherlab-users] Best filesystem for DLS?

Sebastien BLANCHET blanchet at iram.fr
Wed Apr 28 19:17:44 CEST 2021


Hi,

Here is a follow-up on the best filesystem for DLS with millions of 
files:

I had reached the inode limit (78M) on a 1TB ext4 filesystem with DLS.

I have switched from ext4 to ZFS, and now it works well.

I have not tested XFS: I am more familiar with ZFS, and since it works, 
I am keeping it.
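
For anyone who wants to watch inode consumption before hitting the limit, here is a minimal Python sketch of the same check that "df -i" performs. It relies only on the standard os.statvfs call; the function name and the path "." are placeholders for this example, so point it at your own DLS data directory. Note that some filesystems (btrfs, for instance) report zero total inodes because they allocate them dynamically.

```python
import os


def inode_usage(path="."):
    """Return (total, free, used_fraction) for the filesystem holding `path`.

    `path` is a placeholder; pass the directory where DLS stores its data.
    """
    st = os.statvfs(path)          # POSIX only
    total = st.f_files             # total number of inodes
    free = st.f_ffree              # number of free inodes
    used = max(total - free, 0)
    # Filesystems with dynamic inode allocation may report total == 0.
    used_fraction = used / total if total else 0.0
    return total, free, used_fraction


if __name__ == "__main__":
    total, free, used_fraction = inode_usage(".")
    print(f"inodes: total={total} free={free} used={used_fraction:.1%}")
```

Running this periodically (from cron, for example) would give an early warning well before the filesystem refuses to create new files.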

Regards,
-- 
Sebastien BLANCHET

On 3/24/21 3:41 PM, Dr.-Ing. Wilhelm Hagemeister wrote:
> Hello Sebastien,
> 
> On 24.03.21 at 15:04, Sebastien BLANCHET wrote:
>> Dear Wilhelm,
>>
>> You are right, there was a configuration error in job.xml: the block size
>> was only 10. I have increased it to 2000 to get fewer files.
> 
> That is not really a configuration error. It is the default value ;-)
> At your sample frequency of 1Hz, it gives you a latency of 10 seconds.
> 
> A blocksize of 2000 at 1Hz means that new blocks get written only about
> every half hour (2000 s). So you are only able to see new data (in
> dlsgui) after this time. If that is too much latency, reduce the
> blocksize, e.g. to 60.
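
The latency arithmetic described here can be sketched in a few lines of Python. This is only an illustration under the assumption stated in the message, namely that a block is written once it holds "blocksize" samples; the function name is made up for this example and is not part of DLS.

```python
def block_latency_seconds(block_size, sample_rate_hz):
    """Time until a full block of `block_size` samples has been collected."""
    return block_size / sample_rate_hz


if __name__ == "__main__":
    # Block sizes mentioned in this thread, all at a 1 Hz sample rate.
    for block_size in (10, 60, 2000):
        seconds = block_latency_seconds(block_size, 1.0)
        print(f"blocksize {block_size:>4}: {seconds:6.0f} s "
              f"({seconds / 60:.1f} min)")
```

The trade-off is between how soon new data becomes visible and how many blocks get written.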
> 
> Anyway, larger blocksizes result in better compression and less overhead.
> 
> Increasing "Reduction" has more impact on file count, because it
> influences the creation of "meta levels". Doubling that value (default
> 30) should be no problem.
> 
> DLS is definitely the right tool to store huge amounts of data over a
> long period, but to be honest we have never had jobs with 20k signals up
> until now.
> 
> It would be much appreciated if you later could share your experience
> with other filesystems.
> 
> Regards
> 
> Wilhelm Hagemeister
> 
>>
>> I will do some tests with other filesystems that can handle more files.
>>
>> I am sampling 20K signals at 1Hz without trigger and I keep them for one
>> year for technical forensic analysis.
>>
>> I do not know if DLS is designed to operate at such scale, but it is the
>> easiest archiving tool I have ever found and it seems to support the
>> workload.
>>
>> Best Regards,
>> ---
>> Sebastien BLANCHET
>>
>>
>>
>> On 3/23/21 5:30 PM, Dr.-Ing. Wilhelm Hagemeister wrote:
>>> Hi Sebastien,
>>>
>>> increasing the number of inodes helps.
>>>
>>> dumpe2fs -h /dev/sdxx shows what your limits are now.
>>>
>>> Regarding DLS, please check whether you get restarts of the sampling
>>> processes due to sample time constraints. Usually this is the case with
>>> data sources from realtime processes with high jitter. This results in
>>> new chunks being created (and a lot of new files). So have a look in your
>>> channel directory (or in dlsgui) to see if you have lots of chunks
>>> separated by breaks of only a few seconds.
>>> Also you get a log entry in your messages (or journalctl) saying
>>> something like: "Time diff of xxx us (expected xxx us, error is percent
>>> xxx %)"
>>>
>>> Also increasing the numbers for blocksize and reduction in the channel
>>> setup decreases the number of files.
>>>
>>> Are you sampling constantly or just small chunks with a trigger for your
>>> job? If the latter is the case, try to increase the chunk size by
>>> sampling constantly over longer periods.
>>>
>>> "btrfs" and "xfs" have no fixed inode limit, but we have no experience
>>> with them.
>>>
>>> Regards Wilhelm Hagemeister
>>>
>>> On 23.03.21 at 17:05, Merkel, Amos wrote:
>>>> Hi,
>>>>
>>>>
>>>> I have never tried, but couldn't you simply manually define a
>>>> lower bytes-per-inode ratio (which yields more inodes) with
>>>> mkfs.ext4 -i ?
>>>>
>>>> Arch-Wiki has some explanations on the topic:
>>>> https://wiki.archlinux.org/index.php/Ext4#Bytes-per-inode_ratio
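
To get a feel for what the bytes-per-inode ratio means in practice, here is a rough Python estimate. It uses the rule that mke2fs creates about one inode per "bytes-per-inode" bytes of space, so a smaller ratio yields more inodes. The filesystem size of exactly 10^12 bytes and the ratio 16384 (a commonly shipped mke2fs.conf default) are assumptions for illustration, not values read from the system discussed here, and the function name is made up.

```python
def estimated_inodes(fs_bytes, bytes_per_inode):
    """Approximate inode count: one inode per `bytes_per_inode` bytes."""
    return fs_bytes // bytes_per_inode


if __name__ == "__main__":
    fs_bytes = 10**12  # assumed size of a "1TB" filesystem
    # 16384 is an assumed default; the smaller ratios are examples to try.
    for ratio in (16384, 8192, 4096):
        count = estimated_inodes(fs_bytes, ratio)
        print(f"-i {ratio:>5}: about {count / 1e6:.0f}M inodes")
```

The real count will differ somewhat because of rounding per block group, and the ratio can only be chosen when the filesystem is created, so changing it means recreating the filesystem.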
>>>>
>>>>
>>>> Greetings,
>>>>
>>>> Amos
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From:* Etherlab-users <etherlab-users-bounces at etherlab.org> on behalf
>>>> of blanchet at iram.fr <blanchet at iram.fr>
>>>> *Sent:* Tuesday, 23 March 2021 16:35:28
>>>> *To:* etherlab-users at etherlab.org
>>>> *Subject:* [Etherlab-users] Best filesystem for DLS?
>>>>    Hi,
>>>>
>>>> Short version:
>>>> -------------------
>>>> What is the best filesystem for DLS data?
>>>>
>>>> Long version:
>>>> -------------------
>>>> I am running DLS to archive a lot of signals (about 20K), and I am
>>>> facing an issue with the number of files.
>>>>
>>>> I have a 1TB EXT4 filesystem to store the data. It is 50% full but there
>>>> are no more free inodes, because DLS creates many files and directories.
>>>> According to the output of “df -i”, DLS has already created about 78M
>>>> files.
>>>>
>>>> I understand that I have to switch to another filesystem that supports a
>>>> bigger number of files.
>>>> And I wonder, what is the best filesystem for such a case.
>>>>
>>>> Regards,
>>>> Sebastien BLANCHET
>>>> -- 
>>>> Etherlab-users mailing list
>>>> Etherlab-users at etherlab.org
>>>> https://lists.etherlab.org/mailman/listinfo/etherlab-users
>>>>
>>>
>>>
>>
> 



