31
votes
Perl CGI script to serve a PDF file
I will go through your code line by line and give feedback. I will skip the general "don't use CGI" advice, as CGI is actually well suited to what you are trying to do here.
I wrote this answer in two ...
25
votes
Accepted
Explicit song lyrics checker
I recommend practicing Python 3 rather than Python 2 these days.
According to PEP 8, isClean() should be is_clean().
One or ...
25
votes
Accepted
Transcode UCS-4BE to UTF-8
Efficient file I/O
By default, files opened with fopen() are buffered, meaning that not every call to fread() or ...
24
votes
Transcode UCS-4BE to UTF-8
log is already declared in <math.h>. You don't need to declare it yourself. In fact, it could be harmful.
As stated in ...
22
votes
Accepted
92 Spoons AI, sort of an AI in C++
There's a lot to say here, so let's not waste any time and jump right into it:
You said that you didn't want to hear the using namespace std;-drill, but ...
21
votes
Explicit song lyrics checker
Splitting line into words using words = line.split(" ") will only split the line on space ...
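A minimal sketch of the difference between `line.split(" ")` and `line.split()`. The sample string is made up for illustration and is not taken from the original question.

```python
# Hypothetical sample line with a double space, a tab, and a trailing newline.
line = "hello  world\tagain\n"

# Splitting on a single space keeps empty strings and leaves tabs/newlines in place.
by_space = line.split(" ")

# Splitting with no argument splits on any run of whitespace and drops empty strings.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

With no argument, `split()` also discards leading and trailing whitespace, which is usually what word-level processing wants.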
17
votes
Accepted
A simple error messaging and logging system via macro(s) in C++
You’ve already noted that this could be done better without macros, so I won’t belabour the point. I will note, though, that your goal—“to refresh [your] skills at writing good solid macros”—makes ...
16
votes
92 Spoons AI, sort of an AI in C++
Here are some things that may help you improve your code.
Don't abuse using namespace std
I know you said you don't want to hear this, but your code is unusable ...
16
votes
Explicit song lyrics checker
Some suggestions:
Using argparse to parse arguments is preferable to any kind of interactive input, because it means that
it's much easier to automate and integrate ...
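As a rough sketch of the `argparse` suggestion, the parser below takes its input from the command line instead of prompting. The argument names `lyrics_file` and `--words` and the default file name are assumptions made for illustration, not part of the reviewed code.

```python
import argparse


def build_parser():
    # Hypothetical interface: one positional file plus an optional word list.
    parser = argparse.ArgumentParser(
        description="Check a lyrics file for explicit words."
    )
    parser.add_argument("lyrics_file", help="path to the lyrics file to check")
    parser.add_argument(
        "--words",
        default="explicit_words.txt",
        help="path to a file listing the words to look for",
    )
    return parser


# Passing an explicit list makes the parser easy to exercise from a script or test;
# calling parse_args() with no arguments would read sys.argv instead.
args = build_parser().parse_args(["song.txt"])
print(args.lyrics_file, args.words)
```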
15
votes
Markov chains to generate text
Functions
Split the code into functions, also split the generation and the presentation. Your algorithm has some clear distinct tasks, so split along these lines:
read input
assemble chain
construct ...
15
votes
Program that finds the longest word not containing one of the disallowed characters
I see a number of things that may help you improve your program.
Fix the bug
There is a subtle bug in the original implementation. It contains these lines:
...
15
votes
Accepted
Average spam confidence
Not duplicating any of @Peilonrayz's code review points ...
Stop reading entire files into memory when you can process the file line by line in one pass, and stop creating huge lists in memory which ...
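One way to process the input in a single pass, without building lists in memory, might look like the sketch below. The `X-DSPAM-Confidence:` prefix and the function name `average_confidence` are assumptions for illustration; any iterable of lines, including an open file object, would work.

```python
import io


def average_confidence(lines, prefix="X-DSPAM-Confidence:"):
    """Average the numeric values on lines starting with prefix, in one pass."""
    total = 0.0
    count = 0
    for line in lines:  # a file object yields one line at a time
        if line.startswith(prefix):
            total += float(line[len(prefix):])
            count += 1
    # Return None rather than dividing by zero when no line matched.
    return total / count if count else None


# io.StringIO stands in for an open file here.
sample = io.StringIO(
    "From: a@example.com\n"
    "X-DSPAM-Confidence: 0.5\n"
    "X-DSPAM-Confidence: 0.25\n"
)
result = average_confidence(sample)
print(result)
```

Only a running total and a counter are kept, so memory use stays constant regardless of file size.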
13
votes
Basic C copy file implementation
You test the return value of fwrite, which is good. However, fread may fail as well. Since ...
13
votes
Read file into vector<byte>
Nice:
using bytes = std::vector<std::byte>;
I would call it Bytes to make it clear it is a type rather than an object.
...
13
votes
Transcode UCS-4BE to UTF-8
This program reads 4 byte codepoints (in BIG ENDIAN) from a file strictly called "input.data" and creates another file called "ENCODED.data" with the relative encoding in UTF8.
Needless to say, ...
13
votes
Accepted
fsize: A Command-Line Tool for Checking File Sizes
Multiple puts() calls
For the --help message, I get why you made a puts() call for each line,...
12
votes
92 Spoons AI, sort of an AI in C++
Header files
You have a lot of implementation code in that header file (files.h). Header files should only contain the function or class headers, the implementation should go into an extra cpp file. ...
12
votes
Comparing two large binary files in C
fread (tmp1, sizeof *tmp1, readsz, fp1);
fread (tmp2, sizeof *tmp2, readsz, fp2);
count += 16;
if(memcmp(tmp1, tmp2, readsz)){
    …
}
You are discarding the ...
12
votes
Accepted
Basic C copy file implementation
Prefer Symbolic Constants Over Magic Numbers
There is a header file that should be included, stdlib.h, that provides some standard symbolic constants such as EXIT_SUCCESS and EXIT_FAILURE. It might ...
12
votes
Average spam confidence
I think ask_file_name looks fine without using pathlib. The difference between the two comes down to LBYL vs EAFP.
For the most ...
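To make the LBYL versus EAFP distinction concrete, here is a small sketch of both styles applied to reading a file. The function names `read_lbyl` and `read_eafp` are hypothetical and do not come from the reviewed code.

```python
import os


def read_lbyl(path):
    # Look Before You Leap: check the precondition, then act.
    # The file could still disappear between the check and the open.
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        return f.read()


def read_eafp(path):
    # Easier to Ask Forgiveness than Permission: just try, and handle failure.
    try:
        with open(path) as f:
            return f.read()
    except OSError:  # FileNotFoundError and PermissionError are subclasses
        return None
```

The EAFP version also handles cases the check does not cover, such as a file that exists but cannot be read.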
11
votes
Accepted
CSV file reader in PHP that supports large files (>15k lines)
Performance
As performance is your main concern, let's tackle it first. To process the example CSV file with ~36k lines, your original script needs around 139s*.
The main bottlenecks are ...
11
votes
Accepted
Most common words in a text file of about 1.1 million words
This is actually quite good. Good use of the collections module.
One improvement I can think of is switching to the ...
11
votes
Accepted
Counting relevant entries in a large bioinformatics file
To speed this up, you'll need to avoid as many string creation operations as possible, because they are expensive. The split operation is especially expensive. Not only does this create many ...
11
votes
Accepted
Comparing two large binary files in C
I'm trying to read two sufficiently large binary files, comparing them and printing the offset at which they differ.
compare_two_binary_files() is faulty for ...
11
votes
Accepted
A utility to swap two files
file1_path == file2_path surely tells that paths refer to the same file. However, even if file1_path != file2_path they still ...
11
votes
Robust program to write an array of certain data type to a binary file and read back from it (C++17)
using namespace std;
Don't do using namespace std;. It can lead to name collisions and other issues.
Typing out the full names (...
11
votes
Get histogram of bytes in any set of files in C++14
This key loop looks incorrect:
while (!stream.eof() && stream.good()) {
    unsigned char ch;
    stream >> ch;
    histogram[ch]++;
}
If ...
10
votes
Counting relevant entries in a large bioinformatics file
If you are in for raw performance, try to avoid repeating potentially cost-intensive operations.
In this case, you split the lines twice with the same parameter, which repeatedly applies a regular ...
10
votes
Accepted
Writing a bitmap image from C++
Not bad! Let's go through it line-by-line.
typedef unsigned char BYTE;
This is fine, but just so you know, starting from C++17, there is a ...